LogoSkills

ClickHouse Backend Rules

Rules to prevent bugs during ClickHouse backend development.

Rules to prevent bugs during ClickHouse backend development.

Type Safety Rules#

scalar() Method Type Casting#

Problem: The ClickHouse scalar<double>() method returns int when the value is 0.

// ❌ WRONG: Returns int(0) when value is 0 → type error
final avgTime = await client.scalar<double>(
  'SELECT avg(reading_time) FROM ...',
);  // Returns int(0) when result is 0 → double cast fails!

// ✅ CORRECT: Receive as num and convert with toDouble()
final avgTimeRaw = await client.scalar<num>(
  'SELECT avg(reading_time) FROM ...',
);
final avgTime = avgTimeRaw?.toDouble() ?? 0.0;
// See analytics_service.dart
double _toDouble(Object? value) {
  if (value == null) return 0;
  if (value is double) return value;
  if (value is int) return value.toDouble();
  if (value is num) return value.toDouble();
  return 0;
}

// Usage
final result = await client.scalar<num>('SELECT sum(amount) FROM ...');
final amount = _toDouble(result);

Row Data Processing#

// ❌ WRONG: Direct casting
final rows = await client.select('SELECT value FROM ...');
final value = rows.first['value'] as double;  // Could be int!

// ✅ CORRECT: Use helper function
final rows = await client.select('SELECT value FROM ...');
final value = _toDouble(rows.first['value']);

CDC Data Flow Rules#

Verify Data Storage and Query Table Match#

Problem: Data will not be visible if the storage table and query table are different.

// ❌ WRONG: Storage and query table mismatch
// EventTrackingService → Stores in ClickHouse `events` table
await eventService.track('screen_view', ...);

// AnalyticsService → Queries from ClickHouse `user_event` table
final result = await client.select('SELECT * FROM user_event');
// Result: Empty data! (different table)

// ✅ CORRECT: Use CDC flow
// PostgreSQL `user_event` → CDC → ClickHouse `user_event`
await UserEvent.db.insertRow(session, event);
final result = await client.select('SELECT * FROM user_event');

CDC vs Direct Storage Selection Criteria#

MethodWhen to UseAdvantagesDisadvantages
PostgreSQL CDC Transaction consistency needed, audit trail needed ACID, audit logs, backup Latency (1-5 seconds)
Direct ClickHouse High-frequency events, real-time analytics only Fast storage No transactions
// CDC method (recommended - most business events)
final event = UserEvent(
  userId: userId,
  eventName: 'screen_view',
  properties: jsonEncode(properties),
  timestamp: DateTime.now(),
);
await UserEvent.db.insertRow(session, event);

// Direct storage method (high-frequency events only)
await eventService.track('mouse_move', properties: {...});

ClickHouse Query Rules#

NULL Handling#

// ClickHouse treats NULL as 0 or empty string by default
// Explicit NULL handling recommended

// ❌ WRONG: No NULL handling
SELECT avg(value) FROM table  // Unexpected results when NULL is included

// ✅ CORRECT: Explicit NULL handling
SELECT avgIf(value, value IS NOT NULL) FROM table
SELECT coalesce(avg(value), 0) FROM table

Date/Time Handling#

// ClickHouse stores in UTC
// KST (Korea Standard Time) conversion required

// ✅ Timezone conversion
SELECT toDateTime(timestamp, 'Asia/Seoul') as local_time FROM ...

// ✅ Date grouping (Korea time-based)
SELECT toDate(timestamp, 'Asia/Seoul') as date, count() FROM ...
GROUP BY date

Checklist#

When implementing new analytics features:

  • Use scalar<num>() + apply _toDouble() helper
  • Verify storage table and query table match
  • Decide CDC flow vs direct storage
  • Specify NULL handling explicitly
  • Apply timezone (KST) conversion

  • backend/kobic_server/lib/src/feature/analytics/service/analytics_service.dart - _toDouble() helper pattern
  • backend/kobic_server/lib/src/feature/analytics/endpoint/screen_view_endpoint.dart - CDC method example
  • .claude/skills/clickhouse/SKILL.md - CDC table list