GenProfile

Name to Age Profiler

"What's in a name?" A lot, it turns out...

For organizations looking for a baseline understanding of their customers or constituents, Genprofile.app provides a profile of the age makeup of their constituent base compared to a user-defined baseline population. Our profiling engine fuses Social Security Administration data on more than 100,000 first names with the latest Census Bureau age estimates by one-year age group. It offers an ideal way for a small retail business or membership organization to benchmark representation by age segment within its local market as a starting point for market analysis and planning. It is also suitable for national or regional analyses using larger geographic scopes.
(Please note that this service is designed for US-based persons only at this time.)

The cost of a basic profile is just $125 for profiles of up to 50,000 unique persons; just click the 'Get a Profile' button below and we will get started. If your database exceeds this threshold, please use the 'get more info' form below and put a note in the 'request' section, and we will get back to you with pricing for your scenario.

Great, let's get started...

This application is brought to you by 35-year analytics veteran and TidyAnalytics founder Joel Narducci. TidyAnalytics' goal as a company is to develop apps and datasets that distill complex data for organizations of all stripes in an affordable and immediately useful way. Nothing is more fundamental to the human situation than age and lifestage.

Some details first about data security and privacy:

Any data you send is processed and stored within a private data enclave within the Microsoft Azure cloud, including private key-vault and storage accounts, with your data segregated from other client data at all times.

By default, we retain the source data you provide for a period of 30 days, unless you expressly direct us in writing otherwise; we retain the aggregate profile we generate for you indefinitely, for your convenience, unless you direct us otherwise.

There are a couple of requirements for the format of the data, which we will summarize on the next page. Please note that more detailed information will be provided in our upload instructions via email.

Genprofile.app is a profiling engine, not a process designed to overlay age data onto your existing customer records. It is in no way a threat to the privacy of your customers, who are only analyzed in the aggregate.

Similarly, TidyAnalytics is not an information brokerage and does not retain, broker, or sell individual-level consumer information, and never will.

Because of this, the only data field strictly necessary to send is "first name".

To reiterate our privacy-first design:

Please note that our service is intended for initial exploratory insight, not for overlaying information onto your customer file. One thing we definitely do not want is for this tool to be used to target, include, or exclude specific individuals based on the results; there are enough companies in the world doing that already. As a result, we report back (and store beyond 30 days) only the aggregate profile. We do report back basic statistics on the match rate of your list against our database.

Next...

You can send individual records with or without customer identifiers. However, if your data does NOT represent one record per unique person, you will also need to include a count of unique persons for each name and, ideally, an additional field showing a volumetric measure for that name (total visit count, total revenue over the last 3 months, etc.).

Inclusion of an additional volumetric measure is meant to generate a statistically representative profile of the segments that are actually driving your business or organization, rather than simply assigning equal weight to whoever happens to be present in your database regardless of contribution.

If provided, we will use this extra valuation field as a weighting factor for the aggregate profile and will generate two profiles for you: one based on simple count and a second based on the weighted percentages. The weighted profile shows the proportionate contribution of each age cohort. If you do not provide this additional valuation measure, you will receive only a single profile, based on simple name count.
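As a rough illustration of how the two profiles can differ, the sketch below computes a count-based and a value-weighted share by age cohort. The cohort labels, the per-name age distributions, and the function name `build_profiles` are hypothetical and are not taken from the GenProfile engine; they only show the general weighting idea.

```python
from collections import defaultdict

# Hypothetical per-name age-cohort distributions (shares sum to 1.0 per name).
# Real distributions would be derived from SSA and Census data.
NAME_AGE_DIST = {
    "linda": {"18-39": 0.05, "40-64": 0.35, "65+": 0.60},
    "emma": {"18-39": 0.70, "40-64": 0.20, "65+": 0.10},
}

def build_profiles(records):
    """records: iterable of (first_name, value_total), one row per person.

    Returns (count_profile, weighted_profile), each mapping cohort -> share.
    Names not found in NAME_AGE_DIST are skipped (unmatched).
    """
    count_totals = defaultdict(float)
    value_totals = defaultdict(float)
    for first_name, value in records:
        dist = NAME_AGE_DIST.get(first_name.strip().lower())
        if dist is None:
            continue
        for cohort, share in dist.items():
            count_totals[cohort] += share          # each person counts once
            value_totals[cohort] += share * value  # each person counts by value

    def normalize(totals):
        grand = sum(totals.values())
        return {c: v / grand for c, v in totals.items()} if grand else {}

    return normalize(count_totals), normalize(value_totals)

# One high-value "Linda" and one low-value "Emma".
records = [("Linda", 900.0), ("Emma", 100.0)]
count_profile, weighted_profile = build_profiles(records)
```

In this toy input the 65+ cohort is about 35% of the count-based profile but about 55% of the weighted profile, because the higher-value record carries a name that skews older.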

Person-Level - Weighted (Ideal):
- customer id
- first name
- value total (visits, revenue, score, etc.)

Person, Anonymous, Weighted:
- first name
- value total (visits, revenue, score, etc.)

Person, Anonymous:
- first name (unique person)

Aggregate Formats - acceptable:
- first name
- total unique persons

Aggregate Formats - ideal:
- first name
- total unique persons
- value total (visits, revenue, score, etc.)

What you get: We'll send you a time-limited private link where you can see your profile online, as well as download it in PDF form along with the underlying data profile and comparison population measures.

The default comparison base for the age profile is the US population by one-year age group. However, on the following form you can customize the geographic scope of your comparison baseline; if you have a sound understanding of the geographic scope of your audience, this is a really powerful feature that you will want to take advantage of. We currently support selection down to the county level, with support for smaller geographic selections (e.g., a 15-minute trade area) coming soon.
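One common way to express a comparison against a baseline population is a representation index, where 100 means a cohort's share of your list equals its share of the baseline. The sketch below is an assumption about how such a comparison could be computed; the page does not state that GenProfile reports an index, and the function name, cohort labels, and shares are made up for illustration.

```python
def representation_index(profile, baseline):
    """Return cohort -> index; 100 means the same share as the baseline."""
    index = {}
    for cohort, base_share in baseline.items():
        if base_share <= 0:
            continue  # no meaningful comparison for an empty baseline cohort
        index[cohort] = 100.0 * profile.get(cohort, 0.0) / base_share
    return index

# Illustrative shares only (not real data).
customer_profile = {"18-39": 0.20, "40-64": 0.50, "65+": 0.30}
county_baseline = {"18-39": 0.40, "40-64": 0.40, "65+": 0.20}
index = representation_index(customer_profile, county_baseline)
```

With these made-up shares, the 18-39 cohort indexes at roughly 50 (under-represented relative to the baseline) and the 65+ cohort at roughly 150 (over-represented).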

Finally, please feel free to add any questions or additional information we should know about your use case in the 'Request' area of the next form.

Let's Go!

Name to Age Profiler FAQ

Basic Profile - Free Until 10/1

Question:	Answer:
What is the source of the data?	The primary source for age baselines is the US Census Bureau, which in some of its tabulations and population estimates provides the count of total persons by single year of age. Names data comes from the Social Security Administration, which publishes name counts of persons born by year, for all years since the agency's inception. The current dataset consists of just over 100,000 unique first names. The SSA also generates actuarial data in the form of cohort life tables for the US population, which TidyAnalytics uses to estimate the proportion of persons born each year who survive to 2024. This last step is crucial for determining the correct current representation expectations for each name assigned in a given year.
Are there any issues to be aware of?	You should keep a couple of things in mind when using these profiles. The SSA data is limited to US-born persons whose state of birth is known. In addition, the names supplied for any given year are limited to those used at least five times in that year, with names under this threshold suppressed under SSA privacy protocols. The upshot: the initial iteration of the model may not cover a significant share of customers or members from diverse, non-US-origin populations, who may have uncommon names that fall under the coverage threshold or, alternatively, an age distribution for a covered name that differs from the profile for US-born persons. This is a difficult analytical problem, but we are working diligently on the next generation of the age model. Our goal is to assign all names, including those not found in the SSA data, a credible expected age distribution based on other characteristics that are known or can be reasonably inferred. TidyAnalytics has a number of other global name datasets in-house that we are reviewing for potential use. We hope to release a refined version of the model sometime in 2026.
I see that the SSA reports its name data by gender, but I notice that you do not ask for or support inclusion of gender in your matching process. Why?	Setting aside the somewhat contentious nature of the topic these days, the answer is a practical one: many, if not most, users of the profiling process (businesses and organizations) will not have gender recorded in their files, so requiring gender as a match field or supported dimension would reduce the general applicability and usefulness of the profiling tool. The upshot: ignoring gender means the age profiles likely lose some precision (i.e., for names whose pattern of use has differed between male and female through the years). However, the process benefits greatly from the simplicity of a model that collapses the gender counts into a single count for each name. It also avoids the issues of gender ambiguity and gender determination in a matching context, and the wide unavailability of the attribute in source data systems.
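The answers above describe two modeling steps: collapsing the SSA's male and female counts into a single count per name, and scaling births by the estimated proportion surviving to 2024. The sketch below illustrates both under simplifying assumptions. The function name, the input shapes, and all numbers are hypothetical, and using a single survival proportion per birth year is a simplification rather than a description of the actual model.

```python
def expected_age_distribution(births_by_year_sex, survival_to_ref, ref_year=2024):
    """Estimate the current age distribution of living bearers of one name.

    births_by_year_sex: {(birth_year, sex): count} for a single name.
    survival_to_ref: {birth_year: proportion of that cohort alive in ref_year}.
    Returns {age: share of expected living bearers}.
    """
    # Collapse male and female counts into a single count per birth year.
    births_by_year = {}
    for (year, _sex), count in births_by_year_sex.items():
        births_by_year[year] = births_by_year.get(year, 0) + count

    # Expected survivors = births x survival proportion for that birth year.
    survivors = {
        ref_year - year: count * survival_to_ref.get(year, 0.0)
        for year, count in births_by_year.items()
    }
    total = sum(survivors.values())
    return {age: n / total for age, n in survivors.items()} if total else {}

# Illustrative numbers only (not SSA data).
births = {(1950, "F"): 800, (1950, "M"): 200, (2000, "F"): 500}
survival = {1950: 0.5, 2000: 1.0}
dist = expected_age_distribution(births, survival)
```

Here the 1,000 births in 1950 and the 500 births in 2000 each yield 500 expected survivors, so ages 74 and 24 each receive half of the distribution, even though the raw birth counts differ by a factor of two.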