Coding Agent Swarm Brief: Synthetic Client Graph Generation

Date: 2026-04-18
Status: Draft
Owner: Sagnik
Reviewers: Sayan, Sourik
Scope: Generate 250 full synthetic client graphs for Project Velocity CRM and Client 360 validation
Purpose: Provide a decision-complete brief to a coding-agent swarm that will synthesize realistic client datasets aligned to the future founder CRM schema.
Decision Boundary: This brief defines the data generation target. It does not itself generate the data.

1. Mission

Generate 250 fully synthetic client graphs that can be imported into Project Velocity and used to validate the future CRM, import, Client 360, Oracle, QD, and reminder workflows.

These are not toy demo leads. They should behave like real premium real-estate clients and surrounding commercial context.

2. Target Domain Assumptions

The synthetic data must align to the planned canonical domains:

  • crm_*
  • intel_*
  • inventory_*
  • workflow_*

The dataset must feel like it was generated for an AI-native CRM rather than a flat spreadsheet app.

3. Geography and Inventory Pool

Every synthetic client should express interest in one or more of these Kolkata-area projects:

  • Eden Devprayag
  • Sugam Prakriti
  • Atri Aqua
  • Atri Surya Toron
  • Siddha Suburbia Bungalow
  • Merlin Avana
  • DTC Good Earth
  • Siddha Serena
  • Siddha Sky Waterfront
  • Godrej Blue
  • DTC Sojon
  • Shriram Grand City
  • Godrej Elevate
  • Ambuja Utpaala
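
A minimal sketch of one way to assign project interests so that every client has at least one project and every project in the pool is used. The function name `assign_interests`, the seed, and the cap of two extra projects per client are illustrative assumptions, not part of this brief.

```python
import random

# Project pool copied from the list above.
PROJECTS = [
    "Eden Devprayag", "Sugam Prakriti", "Atri Aqua", "Atri Surya Toron",
    "Siddha Suburbia Bungalow", "Merlin Avana", "DTC Good Earth",
    "Siddha Serena", "Siddha Sky Waterfront", "Godrej Blue", "DTC Sojon",
    "Shriram Grand City", "Godrej Elevate", "Ambuja Utpaala",
]


def assign_interests(n_clients, seed=0, max_extra=2):
    """Return one list of project names per client (hypothetical helper).

    The primary project cycles through the pool, so every project is
    covered whenever n_clients >= len(PROJECTS). Extras are sampled
    without replacement from the remaining projects.
    """
    rng = random.Random(seed)  # seeded so runs are reproducible
    assignments = []
    for i in range(n_clients):
        primary = PROJECTS[i % len(PROJECTS)]
        others = [p for p in PROJECTS if p != primary]
        extras = rng.sample(others, rng.randint(0, max_extra))
        assignments.append([primary] + extras)
    return assignments


interests = assign_interests(250)
```

Cycling the primary project is a deliberately simple choice; a real generator would likely weight projects by segment and budget band.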

4. Dataset Composition

Generate at least:

  • 250 synthetic people / primary clients
  • linked family or co-buyer structures where relevant
  • linked accounts/organizations where relevant
  • multiple lead and opportunity states
  • multiple interaction histories per client
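
To make the composition concrete, here is a hedged sketch of how one client graph might be held in memory before export. All class and field names (`ClientGraph`, `Opportunity`, `Interaction`, `co_buyer_ids`, and so on) are hypothetical; the canonical founder CRM schema is defined elsewhere and takes precedence.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Interaction:
    interaction_id: str
    client_id: str          # foreign key back to the primary client
    channel: str            # e.g. "whatsapp", "email", "site_visit"
    occurred_on: date


@dataclass
class Opportunity:
    opportunity_id: str
    client_id: str          # foreign key back to the primary client
    project: str
    stage: str


@dataclass
class ClientGraph:
    client_id: str
    full_name: str
    co_buyer_ids: List[str] = field(default_factory=list)
    account_id: Optional[str] = None   # employer or organization, if any
    opportunities: List[Opportunity] = field(default_factory=list)
    interactions: List[Interaction] = field(default_factory=list)


# Illustrative values only.
graph = ClientGraph(client_id="C-0001", full_name="Synthetic Client 1")
graph.opportunities.append(
    Opportunity("O-0001", "C-0001", "Merlin Avana", "site_visit_done")
)
graph.interactions.append(
    Interaction("I-0001", "C-0001", "whatsapp", date(2026, 1, 10))
)
```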

5. Required Synthetic Output Classes

Identity and CRM records

  • person identity
  • contact details
  • demographic hints
  • account or employer context
  • household/co-buyer relationships
  • lead status and opportunity stage history

Commercial records

  • project interests
  • unit preferences
  • budget bands
  • urgency
  • financing posture
  • timeline to decision
  • objections and motivations

Communication records

  • WhatsApp messages
  • WhatsApp voice-call records
  • voice-call transcripts with speaker segmentation
  • email threads
  • meeting notes
  • reminders and task chains

Timeline and visit records

  • site visits
  • revisit intent
  • stage changes over time
  • follow-up loops

Intelligence and enrichment

  • QD score history
  • QD time-series shifts
  • intent and urgency summaries
  • inferred persona labels
  • risk flags
  • recommended next actions

Evidence placeholders

Generate metadata placeholders for:

  • CCTV references
  • number-plate events
  • room or perception events
  • media asset references

Do not attempt to generate real CCTV image or video payloads unless the swarm can produce them at convincing quality. Metadata placeholders alone are acceptable.

6. File Formats Required

The swarm should produce:

  • import-ready CSV files for major canonical entities
  • JSON sidecars for nested or graph-heavy artifacts
  • relationship maps where one flat CSV is insufficient
  • one README describing how the synthetic dataset is organized
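
The split between flat CSV files and JSON sidecars could look roughly like the sketch below, which uses only the Python standard library. The column names and the sidecar shape (`client_id`, `segments`, `speaker`) are assumptions for illustration, not the canonical schema.

```python
import csv
import io
import json


def write_clients_csv(rows, fieldnames):
    """Serialize flat entity rows to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def build_sidecar(client_id, transcript_segments):
    """Serialize a nested artifact (here, a segmented transcript) to JSON."""
    return json.dumps(
        {"client_id": client_id, "segments": transcript_segments}, indent=2
    )


# Illustrative values only.
clients = [
    {"client_id": "C-0001", "full_name": "Synthetic Client 1",
     "lead_status": "qualified"},
]
csv_text = write_clients_csv(clients, ["client_id", "full_name", "lead_status"])
sidecar_text = build_sidecar(
    "C-0001",
    [{"speaker": "agent", "text": "Hello"},
     {"speaker": "client", "text": "Hi"}],
)
```

Keeping the shared `client_id` in both outputs is what lets an importer join a sidecar back to its CSV row.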

7. Required Realism Rules

  • names, organizations, communication tone, and buying patterns must feel believable
  • communication history should reflect premium property sales behavior, not generic consumer retail behavior
  • stage transitions should make narrative sense
  • reminders and follow-up tasks should reflect actual sales cadence
  • transcripts should contain realistic but synthetic dialogue

8. Distribution Guidance

The 250 graphs should be varied across:

  • high-intent buyers
  • slow-burn investors
  • NRI buyers
  • family decision units
  • price-sensitive but aspirational prospects
  • brokers/referral chains
  • repeat visitors
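
This brief does not fix a ratio between the segments. As a starting point, the sketch below spreads the 250 graphs as evenly as possible across the seven categories. The segment keys and the even split are assumptions, and a single graph may reasonably belong to more than one category in practice.

```python
# Hypothetical keys for the seven categories listed above.
SEGMENTS = [
    "high_intent_buyer",
    "slow_burn_investor",
    "nri_buyer",
    "family_decision_unit",
    "price_sensitive_aspirational",
    "broker_referral_chain",
    "repeat_visitor",
]


def allocate(total, segments):
    """Split `total` across segments; the first `extra` segments get one more."""
    base, extra = divmod(total, len(segments))
    return {
        name: base + (1 if i < extra else 0)
        for i, name in enumerate(segments)
    }


allocation = allocate(250, SEGMENTS)
```

With 250 graphs and seven segments, this yields 36 graphs for each of the first five segments and 35 for each of the last two.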

9. Output Quality Checks

The swarm must ensure:

  • referential integrity across IDs
  • no impossible date ordering
  • no orphaned opportunities or interactions
  • no transcript without parent interaction/call references
  • every QD or enrichment artifact points back to a plausible source
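
Two of these checks, orphan detection and date ordering, can be sketched as small pure functions. The row shapes, key names, and event labels below are hypothetical; the real checks would run against the canonical entity files.

```python
from datetime import date


def find_orphans(child_rows, parent_ids, fk):
    """Return child rows whose foreign key `fk` has no matching parent ID."""
    return [row for row in child_rows if row[fk] not in parent_ids]


def find_date_violations(events):
    """Given (label, date) pairs in intended narrative order, return the
    adjacent label pairs where the later event is dated before the earlier one."""
    return [
        (a, b)
        for (a, da), (b, db) in zip(events, events[1:])
        if db < da
    ]


# Illustrative values only.
client_ids = {"C-0001"}
opportunities = [
    {"opportunity_id": "O-0001", "client_id": "C-0001"},
    {"opportunity_id": "O-0002", "client_id": "C-0999"},  # no such client
]
orphans = find_orphans(opportunities, client_ids, "client_id")

timeline = [
    ("lead_created", date(2026, 1, 5)),
    ("site_visit", date(2026, 1, 3)),     # dated before the lead existed
    ("negotiation", date(2026, 2, 1)),
]
violations = find_date_violations(timeline)
```

The same `find_orphans` helper applies to interactions, transcripts, and enrichment artifacts by changing the child rows and the foreign-key name.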

10. Acceptance Criteria

  • 250 complete synthetic client graphs produced
  • all listed project names are represented
  • output spans CRM, interaction, opportunity, reminder, transcript, and enrichment layers
  • files are structured to support future CSV-first import testing
  • a human reviewer can inspect a client graph and believe it is coherent
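
The layer-coverage criterion lends itself to an automated pre-check before human review. The sketch below assumes each graph is a dictionary keyed by layer name; those keys are hypothetical labels for the six layers named above.

```python
# Hypothetical layer keys, one per layer named in the acceptance criteria.
REQUIRED_LAYERS = (
    "crm", "interaction", "opportunity",
    "reminder", "transcript", "enrichment",
)


def missing_layers(graph):
    """Return the required layers that are absent or empty in one graph."""
    return [layer for layer in REQUIRED_LAYERS if not graph.get(layer)]


# Illustrative values only.
complete = {layer: [{"id": "X-1"}] for layer in REQUIRED_LAYERS}
partial = dict(complete, transcript=[], reminder=None)

complete_gaps = missing_layers(complete)
partial_gaps = missing_layers(partial)
```

A graph passes this pre-check only when `missing_layers` returns an empty list. Coherence still needs a human reviewer.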

11. Bottom Line

The swarm's job is to generate the synthetic world Velocity needs in order to stop designing around empty CRM tables and start validating against realistic client intelligence data.