Skip to content

EcoLafaek Database Schema

AWS AI Agent Hackathon

Overview

Complete database schema for the EcoLafaek environmental waste monitoring system. This schema supports community waste reporting, AI-powered analysis with Amazon Titan Embed vector embeddings, user authentication, hotspot clustering, and analytics.

🧠 Key AI Features:

  • VECTOR(1024) embeddings for semantic similarity search via Amazon Titan Embed
  • AI analysis results with confidence scores and severity classification
  • Geospatial indexing for hotspot detection and mapping
  • Multi-application access from mobile app (Flutter), web dashboard (Next.js), and admin panel

For complete architecture, see Diagram/README.md.

Entity Relationship Diagram

Database Tables

Core Tables

1. users

User authentication and profiles for mobile app.

ColumnTypeDescription
user_idINT (PK)Auto-increment primary key
usernameVARCHAR(50)Unique username
emailVARCHAR(100)Unique email (optional)
phone_numberVARCHAR(20)Contact number
password_hashVARCHAR(255)Hashed password
registration_dateDATETIMEAccount creation timestamp
last_loginDATETIMELast login timestamp
account_statusENUMactive, inactive, suspended
profile_image_urlVARCHAR(255)Profile picture URL
verification_statusBOOLEANEmail verification status

2. reports

Waste reports submitted by users via mobile app.

ColumnTypeDescription
report_idINT (PK)Auto-increment primary key
user_idINT (FK)References users
latitudeDECIMAL(10,8)GPS latitude
longitudeDECIMAL(11,8)GPS longitude
location_idINT (FK)References locations
report_dateDATETIMESubmission timestamp
descriptionTEXTUser-provided description
statusENUMsubmitted, analyzing, analyzed, resolved, rejected
image_urlVARCHAR(255)AWS S3 image URL
device_infoJSONMobile device metadata
address_textVARCHAR(255)Reverse geocoded address

Indexes: (latitude, longitude), (status), (user_id), (status, report_date)

3. analysis_results

AI analysis results from Amazon Bedrock Nova-Pro with vector embeddings.

ColumnTypeDescription
analysis_idINT (PK)Auto-increment primary key
report_idINT (FK)References reports
analyzed_dateDATETIMEAnalysis timestamp
waste_type_idINT (FK)Primary waste type detected
confidence_scoreDECIMAL(5,2)AI confidence (0-100)
estimated_volumeDECIMAL(10,2)Estimated waste volume
severity_scoreINTSeverity rating
priority_levelENUMlow, medium, high, critical
analysis_notesTEXTShort AI analysis summary
full_descriptionTEXTComplete AI analysis
processed_byVARCHAR(50)AI model identifier
image_embeddingVECTOR(1024)Amazon Titan Embed image embedding
location_embeddingVECTOR(1024)Spatial vector embedding

Vector Embeddings: Generated by Amazon Titan Embed Image v1 for semantic similarity search.

4. waste_types

Waste classification categories used by AI.

ColumnTypeDescription
waste_type_idINT (PK)Auto-increment primary key
nameVARCHAR(50)Waste type name
descriptionTEXTWaste type description
hazard_levelENUMlow, medium, high
recyclableBOOLEANRecyclability flag
icon_urlVARCHAR(255)Icon image URL

Standard Types: Plastic, Paper, Glass, Metal, Organic, Electronic, Construction, Hazardous, Mixed

5. hotspots

Geographic clusters of waste accumulation.

ColumnTypeDescription
hotspot_idINT (PK)Auto-increment primary key
nameVARCHAR(100)Hotspot name
center_latitudeDECIMAL(10,8)Cluster center latitude
center_longitudeDECIMAL(11,8)Cluster center longitude
radius_metersINTCluster radius
location_idINT (FK)References locations
first_reportedDATEFirst report date
last_reportedDATEMost recent report
total_reportsINTNumber of reports in cluster
average_severityDECIMAL(5,2)Average severity score
statusENUMactive, monitoring, resolved
notesTEXTAdmin notes

Indexes: (center_latitude, center_longitude)

6. locations

Predefined locations/districts in Timor-Leste.

ColumnTypeDescription
location_idINT (PK)Auto-increment primary key
nameVARCHAR(100)Location name
districtVARCHAR(100)District name
sub_districtVARCHAR(100)Sub-district name
latitudeDECIMAL(10,8)Center latitude
longitudeDECIMAL(11,8)Center longitude
population_estimateINTPopulation count
area_sqkmDECIMAL(10,2)Area in square kilometers

Supporting Tables

7. report_waste_types

Many-to-many relationship for multiple waste types in a single report.

ColumnTypeDescription
idINT (PK)Auto-increment primary key
analysis_idINT (FK)References analysis_results
waste_type_idINT (FK)References waste_types
confidence_scoreDECIMAL(5,2)Type confidence score
percentageDECIMAL(5,2)Percentage composition

8. hotspot_reports

Links reports to hotspots (many-to-many).

ColumnTypeDescription
idINT (PK)Auto-increment primary key
hotspot_idINT (FK)References hotspots
report_idINT (FK)References reports

9. dashboard_statistics

Pre-calculated analytics for dashboard performance.

ColumnTypeDescription
stat_idINT (PK)Auto-increment primary key
stat_dateDATEStatistics date
location_idINT (FK)Location filter
waste_type_idINT (FK)Waste type filter
total_reportsINTTotal reports count
resolved_reportsINTResolved count
average_severityDECIMAL(5,2)Average severity
total_volumeDECIMAL(10,2)Total volume
trend_directionENUMincreasing, stable, decreasing
last_updatedDATETIMEUpdate timestamp

Indexes: (stat_date)

10. image_processing_queue

Queue for async image processing jobs.

ColumnTypeDescription
queue_idINT (PK)Auto-increment primary key
report_idINT (FK)References reports
image_urlVARCHAR(255)S3 image URL
statusENUMpending, processing, completed, failed
queued_atDATETIMEQueue timestamp
processed_atDATETIMEProcessing completion
retry_countINTRetry attempts
error_messageTEXTError details

Authentication Tables

11. user_verifications

Email/OTP verification for user registration.

ColumnTypeDescription
verification_idINT (PK)Auto-increment primary key
user_idINT (FK)References users
emailVARCHAR(100)Email to verify
otpVARCHAR(10)One-time password
created_atDATETIMEOTP creation time
expires_atDATETIMEOTP expiration
is_verifiedBOOLEANVerification status
attemptsINTVerification attempts

12. pending_registrations

Temporary storage for unverified registrations.

ColumnTypeDescription
registration_idINT (PK)Auto-increment primary key
usernameVARCHAR(50)Pending username
emailVARCHAR(100)Pending email
phone_numberVARCHAR(20)Phone number
password_hashVARCHAR(255)Hashed password
otpVARCHAR(10)Verification OTP
created_atDATETIMERegistration time
expires_atDATETIMEExpiration time
attemptsINTVerification attempts

13. api_keys

API keys for external integrations.

ColumnTypeDescription
key_idINT (PK)Auto-increment primary key
api_keyVARCHAR(255)API key string
nameVARCHAR(100)Key name/description
created_dateDATETIMECreation timestamp
expiration_dateDATETIMEExpiration date
activeBOOLEANActive status
permissionsJSONPermission scopes
last_usedDATETIMELast usage timestamp
created_byINT (FK)References users

Admin Panel Tables

14. admin_users

Admin panel user accounts (local only).

ColumnTypeDescription
admin_idINT (PK)Auto-increment primary key
usernameVARCHAR(50)Admin username (unique)
emailVARCHAR(100)Admin email (unique)
password_hashVARCHAR(255)Hashed password
roleENUMsuper_admin, admin, moderator
created_atDATETIMEAccount creation
last_loginDATETIMELast login timestamp
activeBOOLEANActive status

Indexes: (username), (email)

15. system_logs

System activity and audit logs.

ColumnTypeDescription
log_idINT (PK)Auto-increment primary key
timestampDATETIMELog timestamp
agentVARCHAR(50)Agent/service name
actionVARCHAR(100)Action performed
detailsTEXTAction details
log_levelENUMinfo, warning, error, critical
related_idINTRelated entity ID
related_tableVARCHAR(50)Related table name

16. system_settings

Application configuration settings.

ColumnTypeDescription
setting_idINT (PK)Auto-increment primary key
setting_keyVARCHAR(100)Setting key (unique)
setting_valueTEXTSetting value
data_typeENUMstring, number, boolean
descriptionTEXTSetting description
created_atDATETIMECreation timestamp
updated_atDATETIMELast update timestamp

Indexes: (setting_key)

17. notification_templates

Email/SMS notification templates.

ColumnTypeDescription
template_idINT (PK)Auto-increment primary key
nameVARCHAR(100)Template name
subjectVARCHAR(255)Email subject
bodyTEXTTemplate body
typeENUMemail, sms, push
created_atDATETIMECreation timestamp
updated_atDATETIMEUpdate timestamp

Database Configuration

Connection Example (Node.js):

javascript
import mysql from "mysql2/promise";

const pool = mysql.createPool({
  host: process.env.DB_HOST,
  port: process.env.DB_PORT || 4000,
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD,
  database: process.env.DB_NAME,
  connectionLimit: 20,
  queueLimit: 0,
  waitForConnections: true,
});

Connection Example (Python):

python
import mysql.connector
from dbutils.pooled_db import PooledDB

pool = PooledDB(
    creator=mysql.connector,
    maxconnections=20,
    mincached=2,
    host=os.getenv('DB_HOST'),
    port=int(os.getenv('DB_PORT', 4000)),
    user=os.getenv('DB_USER'),
    password=os.getenv('DB_PASSWORD'),
    database=os.getenv('DB_NAME')
)

Vector Search Operations

The analysis_results table contains two VECTOR(1024) columns for semantic similarity search.

Find Similar Reports by Image:

sql
-- Using Amazon Titan Embed image embeddings
SELECT
  r.report_id,
  r.description,
  r.image_url,
  ar.confidence_score,
  VEC_COSINE_DISTANCE(ar.image_embedding, :query_vector) as similarity
FROM reports r
JOIN analysis_results ar ON r.report_id = ar.report_id
WHERE ar.image_embedding IS NOT NULL
ORDER BY similarity ASC
LIMIT 10;

Find Nearby Reports by Location:

sql
-- Using location embeddings
SELECT
  r.report_id,
  r.latitude,
  r.longitude,
  ar.severity_score,
  VEC_COSINE_DISTANCE(ar.location_embedding, :query_location_vector) as distance
FROM reports r
JOIN analysis_results ar ON r.report_id = ar.report_id
WHERE ar.location_embedding IS NOT NULL
ORDER BY distance ASC
LIMIT 10;

Setup Instructions

1. Create Database

sql
CREATE DATABASE db_ecolafaek;
USE db_ecolafaek;

2. Run Schema Migration

bash
# From database/ directory
mysql -u your_user -p db_ecolafaek < schema.sql

Or use individual table schemas from all_schema/ folder.

3. Verify Vector Support

Ensure your database supports VECTOR data type (TiDB, MySQL 8.0.30+, or compatible):

sql
SHOW CREATE TABLE analysis_results;
-- Should show: `image_embedding` vector(1024) DEFAULT NULL

4. Configure Application Connection

See connection examples above for Node.js and Python.

5. Seed Initial Data (Optional)

Insert waste types, locations, and admin users as needed.

Performance Indexes

All performance-critical indexes are included in schema.sql:

  • idx_reports_location - Geospatial queries
  • idx_reports_status_date - Status filtering with date range
  • idx_analysis_results_date - Time-series analysis
  • idx_hotspots_location - Hotspot clustering
  • idx_dashboard_stats_date - Dashboard analytics

Security Best Practices

  • ✅ Never commit database credentials to version control
  • ✅ Use environment variables for all sensitive configuration
  • ✅ Enable SSL/TLS for database connections in production
  • ✅ Implement proper user access controls and roles
  • ✅ Regular automated backups recommended
  • ✅ Use prepared statements to prevent SQL injection
  • ✅ Limit admin panel access to localhost only

For complete system architecture, see Diagram/README.md.


📞 Additional Documentation

AWS AI Agent Global Hackathon

Built with ❤️ for Timor-Leste | AWS AI Agent Global Hackathon