Database Migration Guide

This guide provides comprehensive information on handling database migrations for the Meta Agent Platform.

Table of Contents

  1. Introduction
  2. Migration Tools
  3. Migration Workflow
  4. Migration Command Reference
  5. Best Practices
  6. Multi-Tenancy Considerations
  7. Edge Deployment Considerations
  8. Common Scenarios
  9. Troubleshooting
  10. Monitoring & Validation

Introduction

Database migrations are essential for evolving the database schema as the application develops. The Meta Agent Platform uses a migration-based approach to manage schema changes across all environments (development, staging, and production) in a controlled, versioned manner.

Migration Philosophy

The platform follows these principles for database migrations:

  1. Versioned: All schema changes are versioned and tracked in source control
  2. Incremental: Changes are applied incrementally, building on previous versions
  3. Automated: Migrations can be applied automatically during deployment
  4. Reversible: Where possible, migrations include both "up" and "down" operations
  5. Consistent: Migrations ensure consistent schema across all environments
  6. Safe: Migrations are designed to minimize risk and avoid data loss

Migration Tools

The Meta Agent Platform uses the following tools for database migrations:

SQLAlchemy ORM

SQLAlchemy provides the Object-Relational Mapping (ORM) layer that defines the Python data models representing database tables and relationships.

Key components:

  • Models: Python classes that define database tables and their relationships
  • MetaData: Contains the table definitions and their structure
  • Engine: Connection to the database
  • Session: Unit of work for interacting with the database
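
As a quick orientation, here is a minimal sketch of how those four pieces fit together. The Agent model and the table name are illustrative, not actual platform models:

from sqlalchemy import String, create_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, Session, mapped_column

class Base(DeclarativeBase):
    pass  # Base.metadata (the MetaData) collects every table definition

class Agent(Base):  # Model: maps a Python class to a database table
    __tablename__ = "agents"
    id: Mapped[int] = mapped_column(primary_key=True)
    name: Mapped[str] = mapped_column(String(100))

engine = create_engine("sqlite:///:memory:")  # Engine: the database connection
Base.metadata.create_all(engine)

with Session(engine) as session:  # Session: unit of work for queries and writes
    session.add(Agent(name="example"))
    session.commit()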

Alembic

Alembic is a database migration tool for SQLAlchemy that manages schema changes over time.

Key components:

  • Revisions: Individual migration scripts containing schema changes
  • Version Control: Tracks which migrations have been applied
  • Migration Context: Provides information about the current state of migrations
  • Operations: Methods for schema changes (create_table, add_column, etc.)
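
For reference, a revision script in migrations/versions/ has this general shape; the revision ids below are placeholders:

from alembic import op
import sqlalchemy as sa

# Revision identifiers used by Alembic's version tracking
revision = "a1b2c3d4e5f6"
down_revision = "f6e5d4c3b2a1"  # the revision this one builds on; None for the first

def upgrade():
    op.add_column("agents", sa.Column("status", sa.String(20), nullable=True))

def downgrade():
    op.drop_column("agents", "status")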

Migration Workflow

Initial Setup

If you're setting up migrations for the first time:

  1. Initialize Alembic:

    cd backend
    alembic init migrations

  2. Configure Alembic:

    • Update alembic.ini with the database connection string
    • Configure env.py to use the application models (see the sketch after this list)

  3. Create Initial Migration:

    alembic revision --autogenerate -m "Initial schema"
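
The key env.py change is pointing Alembic's autogenerate support at the application's metadata. A minimal sketch, assuming the models expose a declarative Base in app.models (adjust the import to the project's actual layout):

from app.models import Base  # assumed import path

target_metadata = Base.metadata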

Regular Migration Workflow

The standard workflow for implementing database changes:

  1. Update Models:

    • Modify SQLAlchemy models to reflect the desired schema changes
    • Ensure model changes align with business requirements

  2. Generate Migration Script:

    alembic revision --autogenerate -m "Description of changes"

  3. Review Migration Script:

    • Alembic auto-generates a migration script in migrations/versions/
    • Review the script to ensure it accurately captures the intended changes
    • Modify it if necessary (add data migrations, fix complex changes)

  4. Test Migration:

    • Apply the migration to the development database:

    alembic upgrade head

    • Verify the changes work as expected
    • Test the downgrade if needed:

    alembic downgrade -1

  5. Commit Migration:

    • Commit both the model changes and the migration script to version control

  6. Deploy:

    • During deployment, run alembic upgrade head to apply pending migrations (a programmatic variant is sketched below)
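
Deployment scripts can also invoke Alembic programmatically instead of shelling out. A minimal sketch using Alembic's command API; the config path is an assumption:

from alembic import command
from alembic.config import Config

def apply_migrations() -> None:
    config = Config("backend/alembic.ini")  # assumed path to the Alembic config
    command.upgrade(config, "head")         # same effect as `alembic upgrade head`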

Migration Command Reference

Creating Migrations

# Auto-generate migration based on model changes
alembic revision --autogenerate -m "Description"

# Create empty migration script
alembic revision -m "Description"

Applying Migrations

# Upgrade to latest version
alembic upgrade head

# Upgrade to specific version
alembic upgrade <revision_id>

# Upgrade by relative steps
alembic upgrade +2

# Downgrade to specific version
alembic downgrade <revision_id>

# Downgrade by relative steps
alembic downgrade -1

# Downgrade to base (before any migrations)
alembic downgrade base

Information Commands

# Show current migration version
alembic current

# Show migration history
alembic history

# Show history with the current revision marked (pending migrations follow it)
alembic history --indicate-current

# Show SQL that would be executed (without running it)
alembic upgrade head --sql

Best Practices

Writing Migration Scripts

  1. Keep Migrations Focused:

    • Each migration should handle a specific, coherent change
    • Avoid combining unrelated changes in a single migration

  2. Make Migrations Reversible:

    • Implement both upgrade() and downgrade() methods
    • Test downgrades to ensure they work correctly

  3. Handle Data Migrations Carefully:

    • Add explicit data transformation steps when changing column types
    • Consider performance for large tables (see the batching sketch after this list)

  4. Comment Complex Operations:

    • Add comments explaining the reasoning behind complex changes
    • Document any assumptions or prerequisites

  5. Avoid Raw SQL When Possible:

    • Use Alembic operations instead of raw SQL for better database agnosticism
    • When raw SQL is necessary, comment it clearly and consider database-specific variants
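
For the large-table point above, a common pattern is to backfill in fixed-size batches so no single statement holds locks for long. A sketch with illustrative table and column names:

from alembic import op
import sqlalchemy as sa

def upgrade():
    connection = op.get_bind()
    batch_size = 10_000
    while True:
        # Backfill one batch of rows that still need the new value
        result = connection.execute(
            sa.text(
                "UPDATE events SET status = 'migrated' "
                "WHERE id IN (SELECT id FROM events WHERE status IS NULL LIMIT :n)"
            ),
            {"n": batch_size},
        )
        if result.rowcount == 0:
            break  # no rows left to migrate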

Migration Workflow Practices

  1. Regular Small Migrations:

    • Prefer frequent, small migrations over infrequent, large ones
    • Easier to review, test, and troubleshoot

  2. Test on a Copy:

    • Test migrations on a copy of production data before applying them to production
    • Identify potential performance or data issues early

  3. Maintain Backward Compatibility:

    • When possible, maintain code compatibility with both the old and new schema during transitions
    • Allows for safer rollbacks if issues are discovered after deployment

  4. Version Control:

    • Always commit migration scripts to version control
    • Keep migrations in sync with the corresponding model changes

  5. CI/CD Integration:

    • Run migrations as part of the CI/CD pipeline
    • Include migration tests in automated testing (a minimal example follows this list)
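
One inexpensive automated test is an upgrade/downgrade roundtrip against a throwaway database. A minimal pytest sketch; the config path and SQLite URL are assumptions:

from alembic import command
from alembic.config import Config

def test_migration_roundtrip(tmp_path):
    config = Config("backend/alembic.ini")  # assumed config location
    config.set_main_option("sqlalchemy.url", f"sqlite:///{tmp_path / 'test.db'}")
    command.upgrade(config, "head")    # apply every migration
    command.downgrade(config, "base")  # walk all the way back down
    command.upgrade(config, "head")    # and back up again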

Multi-Tenancy Considerations

The Meta Agent Platform uses a multi-tenant database architecture, which requires special consideration for migrations.

Row-Level Security

  1. Policy Creation:

    • Include RLS policy creation in migrations for new tables
    • Example:

    def upgrade():
        # Create table
        op.create_table('my_table', ...)

        # Add RLS policy
        op.execute("""
            ALTER TABLE my_table ENABLE ROW LEVEL SECURITY;
            CREATE POLICY tenant_isolation_policy ON my_table
                USING (tenant_id = current_setting('app.current_tenant_id')::UUID);
        """)

  2. Policy Updates:

    • When changing table structure, ensure RLS policies are updated accordingly

Tenant-Specific Migrations

  1. Global vs. Tenant-Specific:

    • Determine whether a migration affects all tenants or is tenant-specific
    • For tenant-specific migrations, consider a migration wrapper that iterates through tenants

  2. Migration Function:

    • Example of a tenant-specific data migration (set_config is used because SET cannot take bind parameters):

    def upgrade():
        # Schema changes first (global)
        op.add_column('my_table', sa.Column('new_column', sa.String(), nullable=True))

        # Tenant-specific data migration
        connection = op.get_bind()
        tenants = connection.execute(sa.text("SELECT id FROM tenants")).fetchall()

        for (tenant_id,) in tenants:
            # Scope the session to one tenant, then backfill its rows
            connection.execute(
                sa.text("SELECT set_config('app.current_tenant_id', :tid, false)"),
                {"tid": str(tenant_id)},
            )
            connection.execute(
                sa.text("UPDATE my_table SET new_column = 'default' WHERE new_column IS NULL")
            )

Edge Deployment Considerations

The Meta Agent Platform supports edge deployments with SQLite, which requires special handling for migrations.

SQLite Limitations

  1. Limited ALTER TABLE Support:
  2. SQLite doesn't support all ALTER TABLE operations (like dropping columns)
  3. Workaround: recreate tables with new schema

  4. Migration Strategy:

  5. Use conditional logic in migrations based on database type
  6. Example:
    def upgrade():
        if op.get_context().dialect.name == 'sqlite':
            # SQLite-specific approach
            with op.batch_alter_table('my_table') as batch_op:
                batch_op.add_column(sa.Column('new_column', sa.String()))
        else:
            # PostgreSQL approach
            op.add_column('my_table', sa.Column('new_column', sa.String()))
    

Edge Synchronization

  1. Migration State Tracking:

    • Track the migration state of each edge device (see the sketch below)
    • Sync migration state when the device connects to the central system

  2. Offline Migration:

    • Package the relevant migrations with the edge deployment
    • Apply migrations during edge runtime updates
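
Reading the applied revision from a device's database is straightforward with Alembic's MigrationContext. A sketch of what an edge device might report during sync:

from alembic.migration import MigrationContext
from sqlalchemy import create_engine

def current_revision(database_url: str) -> str | None:
    """Return the Alembic revision recorded in the given database, if any."""
    engine = create_engine(database_url)
    with engine.connect() as connection:
        context = MigrationContext.configure(connection)
        return context.get_current_revision()

The device would call current_revision("sqlite:///edge.db") and include the result in its sync payload, so the central system knows which migrations still need to be shipped.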

Common Scenarios

Adding a New Table

def upgrade():
    op.create_table(
        'new_table',
        sa.Column('id', sa.UUID(), nullable=False),
        sa.Column('name', sa.String(), nullable=False),
        sa.Column('description', sa.Text(), nullable=True),
        sa.Column('created_at', sa.DateTime(), server_default=sa.text('now()'), nullable=False),
        sa.Column('tenant_id', sa.UUID(), nullable=False),
        sa.PrimaryKeyConstraint('id'),
        sa.ForeignKeyConstraint(['tenant_id'], ['tenants.id'], ondelete='CASCADE')
    )

    # Add indexes
    op.create_index('ix_new_table_tenant_id', 'new_table', ['tenant_id'])

    # Add RLS policy
    op.execute("""
        ALTER TABLE new_table ENABLE ROW LEVEL SECURITY;
        CREATE POLICY tenant_isolation_policy ON new_table
            USING (tenant_id = current_setting('app.current_tenant_id')::UUID);
    """)

def downgrade():
    op.drop_table('new_table')

Adding a Column

def upgrade():
    op.add_column('table_name', sa.Column('new_column', sa.String(50), nullable=True))

    # Set default values for existing rows
    op.execute("UPDATE table_name SET new_column = 'default'")

    # Make column non-nullable after setting defaults
    op.alter_column('table_name', 'new_column', nullable=False)

def downgrade():
    op.drop_column('table_name', 'new_column')

Changing Column Type

def upgrade():
    # For PostgreSQL
    op.alter_column('table_name', 'column_name',
                    type_=sa.String(100),
                    existing_type=sa.String(50),
                    postgresql_using='column_name::varchar(100)')

def downgrade():
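    # Note: shrinking the type will fail if existing values exceed 50 characters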
    op.alter_column('table_name', 'column_name',
                    type_=sa.String(50),
                    existing_type=sa.String(100))

Data Migrations

def upgrade():
    # Add new column
    op.add_column('users', sa.Column('full_name', sa.String(255), nullable=True))

    # Migrate data
    op.execute("UPDATE users SET full_name = first_name || ' ' || last_name")

    # Make column non-nullable
    op.alter_column('users', 'full_name', nullable=False)

    # Optionally, remove old columns
    op.drop_column('users', 'first_name')
    op.drop_column('users', 'last_name')

def downgrade():
    # Restore old columns
    op.add_column('users', sa.Column('first_name', sa.String(100), nullable=True))
    op.add_column('users', sa.Column('last_name', sa.String(100), nullable=True))

    # Migrate data back (lossy if full_name contained more than two words)
    op.execute("UPDATE users SET first_name = split_part(full_name, ' ', 1), last_name = split_part(full_name, ' ', 2)")

    # Drop new column
    op.drop_column('users', 'full_name')

Troubleshooting

Common Migration Issues

  1. Migration Conflicts:

    • Symptom: Multiple developers create migrations that conflict, leaving multiple heads
    • Solution: Merge the divergent heads (alembic merge) or create a new migration that combines the changes

  2. Failed Migrations:

    • Symptom: A migration fails partway through execution
    • Solution: Diagnose the problem, then either fix the script and rerun the migration, manually repair the database state and mark the migration as complete, or restore from backup and retry

  3. Missing Dependencies:

    • Symptom: A migration depends on a table or column created in another migration that hasn't run
    • Solution: Ensure migrations are applied in the correct order and check branch dependencies

  4. SQLite Limitations:

    • Symptom: A migration works in PostgreSQL but fails in SQLite (edge deployment)
    • Solution: Use batch_alter_table or create database-specific migration paths

Fixing a Failed Migration

  1. Identify the Issue:

    # Check current migration state
    alembic current
    

  2. Options for Recovery:

    # Option 1: Fix the migration script and retry
    alembic upgrade head
    
    # Option 2: Manually fix database and mark as complete
    alembic stamp <revision_id>
    
    # Option 3: Downgrade and retry
    alembic downgrade <previous_working_revision>
    # Fix the script
    alembic upgrade head
    

Monitoring & Validation

Migration Validation

  1. Pre-Migration Checks:

    • Validate that migration scripts can be generated without errors
    • Check for potential data consistency issues
    • Estimate migration duration for large tables

  2. Post-Migration Validation:

    • Verify the schema changes (see the inspector sketch after this list)
    • Run basic application tests
    • Check data integrity
    • Monitor the performance of affected queries
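
Schema verification can be scripted with SQLAlchemy's inspector. A minimal sketch; the connection URL, table, and column are illustrative:

from sqlalchemy import create_engine, inspect

engine = create_engine("postgresql://localhost/meta_agent")  # assumed URL
inspector = inspect(engine)

columns = {column["name"] for column in inspector.get_columns("new_table")}
assert "tenant_id" in columns, "expected column missing after migration"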

Migration Metrics

Track key metrics during migrations:

  1. Execution Time:

    • How long migrations take to complete

  2. Table Size Changes:

    • How migrations affect database size

  3. Error Rate:

    • Number of failed migrations vs. successful ones

  4. Application Impact:

    • Performance metrics before and after the migration
    • Error rates during and after the migration

Conclusion

Effective database migration management is crucial for the successful evolution of the Meta Agent Platform. By following the practices outlined in this guide, you can ensure smooth schema transitions while maintaining data integrity and application stability.

Remember that migrations are not just about changing the database schema—they're about safely evolving your data model while preserving existing data and functionality.