Selection Criteria for Districts and Municipal Units for a Jewelry Workshop
A multi-stage location selection methodology for Saint Petersburg: criteria, formulas, scoring scales, and model limitations.
The location selection methodology for Saint Petersburg is structured in four stages:
- district ranking (using district-level data only),
- municipal unit (MU) ranking within the selected districts,
- field and qualitative checks for the shortlist,
- a financial model for the shortlisted addresses.
Key principle: do not mix indicators from different levels of aggregation in a single district composite score.
MU-level data is used in stage 2 only and does not contribute to the district score.
Terms and Assumptions
- District – an administrative district of Saint Petersburg (aggregation level: “district”).
- MU – an intra-city municipal unit / municipal district (aggregation level: “mo”).
- Active workshop / organization – a record marked as active (is_work = 1) in the organization dataset.
On zeros (0 workshops): instead of a “default average score,” smoothing is applied in the formulas (+ 0.5), and manual validation (maps/search) is required for any final shortlist entries.
Data Sources (CSV files used in calculations)
District level:
- district_density_stats.csv – district population, number of active workshops, derived competition metrics.
- organisations_with_ids.csv (+ aggregation) and/or organisations-by-district.csv – total and active organization counts for survival rate calculation.
- population-age-sex-structure-2024.csv – age cohorts for estimating the 30–65 target audience.
- working-age-population-female-2024.csv (+ working-age-population-both-2024.csv) – number of women of working age.
- people-by-district-timeline.csv – population dynamics 2018–2025.
- average_salary_by_district.csv – average salary by district.
- commissioning-housing-by-district.csv – new housing commissioned (sum for 2021–2023 is used).
MU level:
- mo_density_stats.csv – MU population and number of active workshops (for competitive pressure within a district).
- people-by-mo-2025.csv and/or people-by-mo-timeline.csv – MU population (for refinement/validation if needed).
- organisations-by-district-and-mo.csv / organisations-by-mo.csv – organizations by MU (for validation as needed).
Stage 1 – District Ranking (District-Level Criteria)
1. Group: Competitive Environment (District)
1.1. Competitor Density (Population per Active Workshop)
Reflects competitive pressure: how many residents share one active workshop. A higher value means lower competition.
Formula (with zero smoothing):
people_per_workshop = population / (workshops_working + 0.5)
Data source: district_density_stats.csv
Scoring scale (by people_per_workshop):
- above 40,000 – 3 points
- 25,000–40,000 – 2 points
- below 25,000 – 1 point
Weight: 20%
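The formula and scale above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual script: the function names are hypothetical, the example district is invented, and assigning the exact boundary values (25,000 and 40,000) to the 2-point band is an assumption, since the scale does not say which band owns them.

```python
def people_per_workshop(population: int, workshops_working: int) -> float:
    """Residents per active workshop; +0.5 keeps the ratio defined at zero workshops."""
    return population / (workshops_working + 0.5)


def score_competitor_density(ppw: float) -> int:
    """Map people_per_workshop onto the 1-3 scale.

    Assumption: the boundary values 25,000 and 40,000 fall into the 2-point band.
    """
    if ppw > 40_000:
        return 3
    if ppw >= 25_000:
        return 2
    return 1


# Hypothetical district: 300,000 residents and 9 active workshops.
ppw = people_per_workshop(300_000, 9)
print(round(ppw), score_competitor_density(ppw))
```

With zero workshops the smoothed denominator is 0.5, so a district of 100 residents would yield 200 rather than a division error.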
Industry context.
Competitive pressure is one of the key factors in industry site selection models (Buxton, Kalibrate, GrowthFactor). Large chains assess not only competitor count but also market share, segmentation (direct vs. indirect), and competitors that intercept customers along their route. The current model uses only an aggregate metric: population per active workshop. If revenue data or segmentation data (repair shops / jewelry chains / pawnshops) become available, refining this criterion is recommended.
Thresholds of 40,000 / 25,000 were chosen with LLM assistance. Without a benchmark based on successful workshop revenue, proper calibration is not possible; it is recommended to validate these thresholds using an analogue method (based on actual revenue of 3–5 existing locations) in the future.
1.2. Business Survival Rate
The share of active organizations out of all registered ones. A high value indicates a more stable market environment.
Formula:
survival_rate = working_orgs / total_orgs
where working_orgs is the number of organizations with is_work = 1, and total_orgs is the total number of organizations in the district.
Data source:
organisations_with_ids.csv (aggregated to district level) and/or organisations-by-district.csv
Scoring scale:
- above 0.70 – 3 points
- 0.60–0.70 – 2 points
- below 0.60 – 1 point
Weight:
5%
Limitation: small sample.
Survival rate is calculated across jewelry workshops only, with 1 to 39 organizations per district (354 total city-wide). At these volumes, the closure of 2–3 locations can shift the indicator by 10–15 percentage points, reflecting individual circumstances (retirement, relocation) rather than systemic district properties. Large companies (Starbucks, McDonald’s) use general SMB survival statistics from tax registry data over 3–5 years, or vacancy rate (share of vacant commercial space), which better captures the health of the business environment.
Recommendation: if vacancy rate or overall business survival data by district becomes available, replace this criterion. Until then, the weight is reduced to 5% to limit the influence of noise.
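A short sketch of the survival rate score and of the small-sample effect described above. The function names and the 20-organization district are hypothetical, and treating the 0.60 and 0.70 boundaries as part of the 2-point band is an assumption.

```python
def survival_rate(working_orgs: int, total_orgs: int):
    """Share of active organizations; None when the district has no records."""
    if total_orgs == 0:
        return None
    return working_orgs / total_orgs


def score_survival(rate: float) -> int:
    """1-3 scale; assumption: exactly 0.60 and 0.70 score 2 points."""
    if rate > 0.70:
        return 3
    if rate >= 0.60:
        return 2
    return 1


# Hypothetical district with 20 registered workshops, 14 of them active.
total = 20
for closed in (0, 2, 3):
    rate = survival_rate(14 - closed, total)
    print(closed, rate, score_survival(rate))
```

In this invented case three closures move the rate from 0.70 to 0.55, a 15-point shift that drops the district by a full score band.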
2. Group: Demographics and Target Audience (District)
2.1. Target Audience Size (Ages 30–65)
Estimates the absolute market size within the key age range.
Formula (strictly based on 2024 data):
target_30_65 = sum(30–34, 35–39, 40–44, 45–49, 50–54, 55–59, 60–64)
(if a separate “65” group is unavailable, the 30–64 range is used as a reproducible approximation)
Data source:
population-age-sex-structure-2024.csv
Scoring scale:
- above 200,000 – 3 points
- 100,000–200,000 – 2 points
- below 100,000 – 1 point
Weight:
15%
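The cohort sum can be sketched as follows. The cohort labels used as dictionary keys and the population figures are assumptions made for illustration; the real column names in population-age-sex-structure-2024.csv may differ. The boundary values 100,000 and 200,000 are assigned to the 2-point band by assumption.

```python
# Five-year cohorts covering ages 30-64 (the reproducible approximation of 30-65).
COHORTS_30_64 = ["30-34", "35-39", "40-44", "45-49", "50-54", "55-59", "60-64"]


def target_30_65(cohorts: dict) -> int:
    """Sum the seven five-year cohorts; raises KeyError if one is missing."""
    return sum(cohorts[label] for label in COHORTS_30_64)


def score_target_audience(size: int) -> int:
    """1-3 scale; assumption: exactly 100,000 and 200,000 score 2 points."""
    if size > 200_000:
        return 3
    if size >= 100_000:
        return 2
    return 1


# Hypothetical district with 20,000 residents in every cohort.
district = {label: 20_000 for label in COHORTS_30_64}
size = target_30_65(district)
print(size, score_target_audience(size))
```

Raising on a missing cohort is a deliberate choice in this sketch: a silently skipped age group would understate the market size without any visible warning.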
Limitation: absolute size vs. density.
This criterion uses the absolute number of target audience residents in the district, not the density of the target audience within a trade area radius. This means large-population districts (Primorsky – 715k, Nevsky – 557k) systematically score higher. The industry standard (Placer.ai, ESRI Business Analyst) is to assess target audience density within a 15–20 minute drive-time zone from a specific address.
For the current stage (rough filtering from 18 to 8 districts), absolute size is acceptable, as Saint Petersburg districts are comparable in developed area. For stage 3 (shortlisting specific addresses), switching to drive-time analysis is recommended.
2.2. Share of Women of Working Age
Women are more likely to initiate a visit and may bring multiple items per visit. This criterion estimates the relative potential for repeat visits.
Formula (as in the original methodology):
female_working_share = female_working_age / total_population
Data sources:
- number of women of working age: working-age-population-female-2024.csv
- total population (year must match the chosen baseline year): people-by-district-timeline.csv
Scoring scale (to avoid arbitrary thresholds):
- 3 points – top third of districts by female_working_share (≥ 67th percentile)
- 2 points – middle third (between the 33rd and 67th percentiles)
- 1 point – bottom third (≤ 33rd percentile)
Weight:
5%
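One way to implement the tercile scale is with the standard library's statistics.quantiles. This is a sketch under assumptions: the cut points are taken at the 1/3 and 2/3 quantiles with the "inclusive" interpolation method, which approximates the 33rd and 67th percentiles named above, and the district names and shares are invented.

```python
import statistics


def tercile_scores(shares: dict) -> dict:
    """Score each district 1-3 by its position relative to the tercile cut points.

    Assumption: cut points come from statistics.quantiles(n=3, method="inclusive").
    """
    lower, upper = statistics.quantiles(shares.values(), n=3, method="inclusive")
    scores = {}
    for district, share in shares.items():
        if share >= upper:
            scores[district] = 3
        elif share <= lower:
            scores[district] = 1
        else:
            scores[district] = 2
    return scores


# Hypothetical female_working_share values for six districts.
shares = {"A": 0.28, "B": 0.30, "C": 0.32, "D": 0.34, "E": 0.36, "F": 0.38}
print(tercile_scores(shares))
```

Because the scale is relative, a district's score depends on the full set of districts passed in, so the function should always receive all 18 districts rather than a filtered subset.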
2.3. Population Growth Dynamics (2018–2025)
Growing districts expand the customer base; negative dynamics signal a risk of declining demand.
Formula:
growth_2018_2025 = (pop_2025 - pop_2018) / pop_2018
Data source:
people-by-district-timeline.csv
Scoring scale:
- above +10% – 3 points
- 0% … +10% – 2 points
- below 0% – 1 point
Weight:
10%
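A minimal sketch of the growth calculation and its scale. The function names and the example populations are hypothetical; placing exactly 0% and exactly +10% in the 2-point band is an assumption.

```python
def growth_2018_2025(pop_2018: int, pop_2025: int) -> float:
    """Relative population change between 2018 and 2025."""
    return (pop_2025 - pop_2018) / pop_2018


def score_growth(growth: float) -> int:
    """1-3 scale; assumption: exactly 0% and exactly +10% score 2 points."""
    if growth > 0.10:
        return 3
    if growth >= 0.0:
        return 2
    return 1


# Hypothetical district growing from 500,000 to 560,000 residents.
g = growth_2018_2025(500_000, 560_000)
print(f"{g:.1%}", score_growth(g))
```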
3. Group: Purchasing Power (District)
3.1. Average Salary in the District
Resident income levels affect willingness to pay for jewelry repair and maintenance services.
Data source:
average_salary_by_district.csv
Scoring scale:
- above 140,000 ₽ – 3 points
- 120,000–140,000 ₽ – 2 points
- below 120,000 ₽ – 1 point
Weight:
10%
Limitation: salary of workers ≠ income of residents. Average salary by district reflects the earnings of people working at organizations within the district, not those living there. The Central District (159,976 ₽) scores high due to office concentration, even though many of its workers live in other districts and will seek services closer to home. Similarly, Krasnoselsky (108,531 ₽) is underrated, even though its residents may work in higher-paying districts.
The industry standard (Starbucks, ESRI) is median household income. If household income data or average cost per sq. m of housing becomes available, replacing this indicator is recommended.
3.2. Indirect Wealth Indicators – not calculated (data unavailable). To be added as a separate criterion if data becomes available.
4. Group: Infrastructure and Growth Potential (District)
4.1. Transit Accessibility – not calculated (data unavailable).
4.2. Parking Infrastructure – not calculated (data unavailable).
4.3. New Housing Commissioned (2021–2023)
New residential construction brings an influx of residents and increases potential demand.
Indicator:
housing_2021_2023 = sum(2021, 2022, 2023)
Data source:
commissioning-housing-by-district.csv (values in thousands of sq. m)
Scoring scale (thresholds in thousands of sq. m):
- above 500 – 3 points
- 200–500 – 2 points
- below 200 – 1 point
Units of measurement.
Data in the CSV is given in thousands of sq. m. Thresholds are compared directly against the sum: Primorsky = 2,478.7 thousand sq. m → 3 points. If one year contains a non-numeric value (e.g., a dash –), the sum is calculated from the available years.
Weight:
5%
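The handling of non-numeric cells can be sketched as below. The function names, the string keys for years, and the example values are assumptions; the sketch also assumes a decimal comma may appear in the CSV and converts it, which should be checked against the actual file.

```python
def parse_value(raw: str):
    """Return a float, or None for non-numeric cells such as a dash."""
    try:
        return float(raw.strip().replace(",", "."))
    except ValueError:
        return None


def housing_2021_2023(values_by_year: dict, years=("2021", "2022", "2023")):
    """Sum the available years (thousands of sq. m) and report how many were used."""
    parsed = [parse_value(values_by_year.get(year, "")) for year in years]
    available = [value for value in parsed if value is not None]
    return sum(available), len(available)


def score_housing(total: float) -> int:
    """1-3 scale in thousands of sq. m; assumption: 200 and 500 score 2 points."""
    if total > 500:
        return 3
    if total >= 200:
        return 2
    return 1


# Hypothetical district with a dash recorded for 2022.
total, years_used = housing_2021_2023({"2021": "310.5", "2022": "–", "2023": "120.0"})
print(total, years_used, score_housing(total))
```

Returning the number of years used alongside the sum makes it possible to flag districts whose total is based on incomplete data, since a two-year sum is not directly comparable with a three-year one.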
5. District Ranking Formula (with correct handling of missing criteria)
To prevent “data unavailable” from automatically yielding a “middle score,” the district ranking is calculated using only available criteria, normalized by the sum of the used weights:
R_district = Σ(Bi × Wi) / Σ(Wi)
where:
- Bi – score (1/2/3) for the criterion,
- Wi – criterion weight (in percent),
- the sum is taken only over criteria that were actually computed.
Coverage is also recorded: coverage = Σ(Wi_used) / Σ(Wi_planned)
Interpreting coverage. With current data, coverage = 70% (for most districts). Missing criteria represent factors that account for 20–25% of weight in industry site selection models:
- pedestrian / vehicle traffic – the most significant gap; all major chains (Starbucks, McDonald’s, X5) include foot traffic and vehicle counts in their top-3 factors;
- transit accessibility – distance to metro stations, bus stops, major roads;
- co-tenancy – proximity to complementary businesses (bridal boutiques, fashion stores, shopping malls).
A ranking with 70% coverage is sufficient for rough filtering (18 → 8 districts), but not sufficient for a final decision. Stage 3 (field checks) is mandatory to compensate for unavailable factors.
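The normalized ranking and the coverage value can be computed together. In this sketch the weights are those listed for the computed criteria, the planned total of 100% is an assumption inferred from the stated 70% coverage, and the example scores are invented. A criterion with no data is passed as None and is excluded from both sums.

```python
# Weights (percent) of the criteria computed from the current CSVs.
WEIGHTS = {"1.1": 20, "1.2": 5, "2.1": 15, "2.2": 5, "2.3": 10, "3.1": 10, "4.3": 5}
PLANNED_TOTAL = 100  # assumption: implied by coverage = 70% with all seven criteria


def district_rank(scores: dict):
    """Return (R_district, coverage) using only criteria that have a score."""
    used = {key: weight for key, weight in WEIGHTS.items() if scores.get(key) is not None}
    if not used:
        return None, 0.0
    weight_sum = sum(used.values())
    r_district = sum(scores[key] * weight for key, weight in used.items()) / weight_sum
    return r_district, weight_sum / PLANNED_TOTAL


# Hypothetical district with no housing data (criterion 4.3 missing).
scores = {"1.1": 3, "1.2": 2, "2.1": 2, "2.2": 1, "2.3": 3, "3.1": 2, "4.3": None}
r, coverage = district_rank(scores)
print(round(r, 2), coverage)
```

Here the missing criterion lowers coverage to 65% instead of pulling the score toward a default value, which is the behavior the formula is meant to guarantee.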
6. District Selection Recommendations
- base filter: R_district ≥ 2.0
- practical approach: take the top-N districts (e.g., 5–8) by R_district and proceed to stage 2.
Stage 2 – MU Ranking Within Selected Districts (MU Level)
At this stage, municipal units are evaluated only within districts that passed stage 1.
2.1. Competitive Pressure at the MU Level (Primary Metric)
Formula (with zero smoothing):
mo_people_per_workshop = mo_population / (mo_workshops_working + 0.5)
Data source:
mo_density_stats.csv
Recommended market size filter:
- MUs with mo_population < 30,000 are treated as low priority (included in the shortlist only with strong local justification).
MU ranking scenario (without double-counting):
- primary sort by mo_people_per_workshop (descending: a higher value means lower competition and a better opportunity),
- secondary sort by mo_population (a larger market is preferred).
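The two-key sort and the market size filter can be sketched as follows. The record layout (a list of dictionaries with a "name" key) and the three example MUs are hypothetical. The sketch only marks small MUs with a low_priority flag and leaves them in the ranking, which is one possible reading of "treated as low priority".

```python
def rank_mus(mus: list) -> list:
    """Sort MUs by smoothed people-per-workshop, then by population, both descending."""
    ranked = []
    for mu in mus:
        ppw = mu["mo_population"] / (mu["mo_workshops_working"] + 0.5)
        ranked.append({
            **mu,
            "mo_people_per_workshop": ppw,
            "low_priority": mu["mo_population"] < 30_000,  # market size filter
        })
    ranked.sort(key=lambda m: (-m["mo_people_per_workshop"], -m["mo_population"]))
    return ranked


# Hypothetical MUs inside one selected district.
mus = [
    {"name": "A", "mo_population": 60_000, "mo_workshops_working": 1},
    {"name": "B", "mo_population": 45_000, "mo_workshops_working": 0},
    {"name": "C", "mo_population": 20_000, "mo_workshops_working": 0},
]
for mu in rank_mus(mus):
    print(mu["name"], mu["mo_people_per_workshop"], mu["low_priority"])
```

MUs A and C tie on the primary metric (40,000 residents per workshop), so the secondary key places the larger market A ahead of C, while C also carries the low-priority flag.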
2.2. “White Spots” (Priority Flag — Not a Separate Weight)
This indicator highlights MUs with large populations and no competitors, without duplicating the primary metric 2.1.
Flag rule:
- mo_workshops_working = 0 and mo_population > 50,000 → high validation priority
- mo_workshops_working ∈ {0,1} and 30,000 ≤ mo_population ≤ 50,000 → medium priority
- otherwise → normal mode
Important:
The flag means “examine more closely,” not “automatically select this MU.” For flagged MUs, manual competitor verification (map search/directories) is required, because zeros often result from incomplete classification or boundary issues.
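The flag rule translates directly into a small function. The function name and the string labels are hypothetical; the thresholds are the ones stated in the rule.

```python
def white_spot_priority(mo_population: int, mo_workshops_working: int) -> str:
    """Return the validation priority flag for one MU."""
    if mo_workshops_working == 0 and mo_population > 50_000:
        return "high"
    if mo_workshops_working in (0, 1) and 30_000 <= mo_population <= 50_000:
        return "medium"
    return "normal"


# Hypothetical MUs: (population, active workshops).
for population, workshops in [(72_000, 0), (41_000, 1), (72_000, 1), (18_000, 0)]:
    print(population, workshops, white_spot_priority(population, workshops))
```

Note that under the rule as written, an MU with more than 50,000 residents and exactly one workshop falls into normal mode, not medium priority.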
Stage 3 – Field and Qualitative Checks (Shortlist)
Stage 3 is applied to the top MUs/locations after stage 2 and is not part of the automated CSV scoring, as no reliable data sources are yet available.
Example factors:
- safety / crime environment,
- street commercial activity (tenants, foot traffic),
- parking availability and vehicle access,
- quality of the specific premises (street-facing, 15–40 m², storefront window, utilities),
- rent and lease terms.
Stage 3 outcome: a shortlist of 3–5 specific addresses for financial modeling.
Stage 4 – Financial Modeling
A financial model is built for shortlisted locations using consistent assumptions (CAPEX/OPEX, rent, traffic, conversion rate, average ticket, seasonality), and the option with the best profile is selected:
- expected profit,
- payback period,
- sensitivity to rent/traffic.
Appendix A – Criteria Summary by Stage
Stage 1 (Districts): criteria computed from current CSVs
- 1.1 Competitor density – 20%
- 1.2 Survival rate – 5%
- 2.1 Target audience 30–65 – 15%
- 2.2 Women of working age – 5%
- 2.3 Population growth 2018–2025 – 10%
- 3.1 Average salary – 10%
- 4.3 New housing commissioned – 5%
Sum of available weights: 70%.
Criteria not computed (data unavailable): 3.2, 4.1, 4.2.
Stage 2 (MUs): criteria and rules
- 2.1 MU competitor density (primary numerical metric)
- 2.2 “White spots” – validation priority flag (no separate weight)