Non-Profile-Enabled Datasets

24 Aug 2025 » Platform

When working with Adobe Experience Platform (AEP), sooner or later you will need to decide which datasets should be enabled for profile. Enabling all of them is not good practice; in fact, doing so will likely get you into trouble. In this post, I will explain what non-profile-enabled datasets are and why they are essential in a healthy AEP implementation, and then walk through some practical patterns.

What is a Non‑Profile-Enabled Dataset?

In AEP, data lands in the Data Lake after being ingested or generated. From there, you may choose to project certain datasets into the Real‑Time Customer Profile (RTCP) by enabling both the schema and the dataset for profile.

Any dataset not projected to RTCP is a non‑profile-enabled dataset. It still lives in the Data Lake and can be fully leveraged by many other tools, such as:

  • Customer Journey Analytics (CJA)
  • Data Distiller
  • External BI tools via the Postgres/BI connector

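To clarify, most datasets, including those that are profile-enabled, reside in the Data Lake. Enabling a dataset for profile projects its data into RTCP in addition to the Data Lake, not instead of it.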
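As a rough illustration of telling the two kinds of dataset apart programmatically, here is a minimal Python sketch. It assumes dataset records shaped like Catalog Service API responses, where, to my understanding, a profile-enabled dataset carries a unifiedProfile tag containing enabled:true. Treat that tag layout as an assumption to verify against your own sandbox; the dataset names are made up.

```python
from typing import Any, Mapping


def is_profile_enabled(dataset: Mapping[str, Any]) -> bool:
    """Return True if the dataset record carries the profile-enabled tag.

    Assumption: the record has a "tags" object whose "unifiedProfile"
    entry is a list containing "enabled:true" when the dataset is
    enabled for profile.
    """
    tags = dataset.get("tags") or {}
    return "enabled:true" in tags.get("unifiedProfile", [])


# Hypothetical dataset records, keyed by a made-up dataset name.
datasets = {
    "web_events_raw": {"tags": {}},
    "crm_profile_attributes": {"tags": {"unifiedProfile": ["enabled:true"]}},
}

# Everything without the tag is a non-profile-enabled dataset.
non_profile = [name for name, ds in datasets.items() if not is_profile_enabled(ds)]
print(non_profile)
```

A check like this can be handy for a periodic audit of which datasets are feeding RTCP.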

Why Keep Some Datasets Out of RTCP?

You know how much I like to start with the why. Here are several reasons to keep certain datasets only in the Data Lake:

  • Cost control. The profile store and identity graph are expensive. Enabling large datasets for profile will increase your license cost.
  • Performance. Loading data into RTCP is resource-intensive. The fewer datasets you enable for profile, the faster ingestion will be.
  • Data not used for activation. If the data will not be used in segmentation or personalization, it does not belong in RTCP.
  • Development and testing. You should never enable a schema or dataset for profile without thorough testing in the Data Lake.
  • Raw datasets. Sometimes a dataset contains far more information than the profile needs. In such cases, the raw dataset should remain outside RTCP and can later be refined into a smaller, profile-enabled dataset.

Core Use Cases

Use in Customer Journey Analytics

CJA data connections are established directly with the Data Lake, not RTCP.

It is common to duplicate data from a single upstream data source: one version, with only the attributes required for activation, is sent to a profile-enabled dataset; another version, with the attributes needed for reporting in CJA, is sent to a non-profile-enabled dataset.
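The split described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any AEP source connector works: the field names and the two field sets are hypothetical, and in a real implementation the split would typically happen in your data preparation or mapping step.

```python
from typing import Any, Dict, Tuple

# Hypothetical attribute lists; choose your own based on what activation
# and reporting actually require.
ACTIVATION_FIELDS = {"customer_id", "email_opt_in", "loyalty_tier"}
REPORTING_FIELDS = {"customer_id", "page_name", "campaign_code", "order_value"}


def split_record(record: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split one upstream record into an activation view and a reporting view."""
    activation = {k: v for k, v in record.items() if k in ACTIVATION_FIELDS}
    reporting = {k: v for k, v in record.items() if k in REPORTING_FIELDS}
    return activation, reporting


record = {
    "customer_id": "c1",
    "email_opt_in": True,
    "loyalty_tier": "gold",
    "page_name": "checkout",
    "campaign_code": "summer",
    "order_value": 42.5,
}

# activation would go to the profile-enabled dataset,
# reporting to the non-profile-enabled dataset used by CJA.
activation, reporting = split_record(record)
```

Note that the identity field appears in both views, so the two datasets can still be related to each other later.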

Use in Data Distiller

A full post would be required to explain everything Data Distiller can do. If you have a license, you are probably already putting it to good use. Query Service (the service behind Data Distiller) works exclusively with the Data Lake.

This means that whenever data is intended for Data Distiller but not needed in RTCP, the source datasets should remain non-profile-enabled.

Typical scenarios include:

  • Derived datasets for profile. A large dataset is ingested into the Data Lake, but only certain attributes are needed in RTCP. A SQL query can extract those attributes and write them into a new dataset that is profile-enabled.
  • Derived datasets for export. Sometimes a custom dataset is required for export. Using the same procedure (a SQL query that writes its result into a new dataset), you can then map the output dataset to a cloud destination. In this case, the output dataset will generally not be enabled for profile.
  • Audience creation. You can also create audiences with Data Distiller. This works similarly to the above examples, where a SQL query selects identities which are then used to build an audience.
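To make the pattern behind these scenarios concrete, here is a small Python sketch that uses an in-memory SQLite database as a local stand-in. It is not Data Distiller and does not use its SQL dialect or options; it only shows the shape of the idea: a wide raw table stays as it is, a query writes a narrower derived table, and a second query selects identities for an audience. The table names, column names, and the threshold are all made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Stand-in for a large raw dataset that stays non-profile-enabled.
conn.execute(
    "CREATE TABLE raw_orders ("
    "customer_id TEXT, order_id TEXT, order_total REAL, internal_notes TEXT)"
)
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?, ?)",
    [
        ("c1", "o1", 40.0, "gift wrap"),
        ("c1", "o2", 60.0, "none"),
        ("c2", "o3", 15.0, "late delivery"),
    ],
)

# Derived dataset: only the attributes needed downstream, one row per customer.
conn.execute(
    """
    CREATE TABLE customer_order_summary AS
    SELECT customer_id,
           COUNT(*)         AS order_count,
           SUM(order_total) AS lifetime_value
    FROM raw_orders
    GROUP BY customer_id
    """
)

summary = conn.execute(
    "SELECT customer_id, order_count, lifetime_value "
    "FROM customer_order_summary ORDER BY customer_id"
).fetchall()

# Audience-style query: select the identities that meet a condition.
audience = [
    row[0]
    for row in conn.execute(
        "SELECT customer_id FROM customer_order_summary "
        "WHERE lifetime_value >= 50 ORDER BY customer_id"
    )
]
```

In AEP, the derived table would be the candidate for profile enablement or export, while the raw table remains in the Data Lake only.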

For more insights, see the Adobe Summit 2025 session S656 - Top Tips to Maximize Value with Adobe Experience Platform Data Distiller.

Profile Snapshot

Every day, the contents of RTCP are copied into a dataset. You can identify this dataset by its name:

Profile-Snapshot-Export-<UUID>

Naturally, this dataset is not profile-enabled. Its common uses include:

  • serving as input to Data Distiller, or
  • exporting to a cloud destination.
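If you need to pick the snapshot dataset out of a list of dataset names, the naming pattern above lends itself to a simple regular expression. The sketch below assumes the UUID part uses the standard 8-4-4-4-12 hexadecimal layout; the sample names and UUID are invented for illustration.

```python
import re

# Assumption: <UUID> follows the usual 8-4-4-4-12 hexadecimal layout.
SNAPSHOT_NAME = re.compile(
    r"^Profile-Snapshot-Export-"
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def is_profile_snapshot(name: str) -> bool:
    """Return True if the dataset name matches the snapshot naming pattern."""
    return SNAPSHOT_NAME.match(name) is not None


# Hypothetical dataset names.
names = [
    "Profile-Snapshot-Export-0f8fad5b-d9cb-469f-a165-70867728950e",
    "web_events_raw",
    "Profile-Snapshot-Export-backup",
]

snapshots = [n for n in names if is_profile_snapshot(n)]
```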

Non-profile-enabled datasets are just as important as profile-enabled ones for keeping your AEP implementation balanced, cost-efficient, and high-performing.

Let me know in the comments if you have seen other interesting use cases for non-profile-enabled datasets.

Photo by Quang Nguyen Vinh


