Building a Scalable SMS Campaign Sender with Fastify, Node.js, and Sinch
This guide provides a complete walkthrough for building a robust Node.js application using the Fastify framework to send bulk SMS marketing campaigns via the Sinch SMS REST API. We will cover everything from project setup and core functionality to deployment and monitoring, enabling you to create a production-ready service.
By the end of this tutorial, you will have a functional API endpoint that accepts a list of phone numbers and a message, then uses Sinch to dispatch the SMS messages efficiently. This solves the common need for businesses to programmatically send targeted SMS campaigns for marketing, notifications, or alerts.
Project Overview and Goals
- Goal: Create a Node.js backend service that exposes an API endpoint to send SMS messages to a list of recipients using the Sinch SMS API.
- Technology:
- Node.js: The JavaScript runtime environment.
- Fastify: A high-performance, low-overhead web framework for Node.js, chosen for its speed, extensibility, and developer-friendly features like built-in logging and schema validation.
- Sinch SMS REST API: The third-party service used to send SMS messages. We'll use its `/batches` endpoint for sending messages.
- Axios: A promise-based HTTP client for making requests to the Sinch API.
- Dotenv: To manage environment variables securely.
- Prisma: (Optional but recommended) An ORM for database interaction to log campaign details.
- Docker: For containerizing the application for deployment.
- Outcome: A REST API endpoint (`POST /campaigns`) that accepts a JSON payload containing `recipients` (an array of phone numbers) and a `message` (string), sends the SMS via Sinch, logs the attempt, and returns a confirmation.
- Prerequisites:
- Node.js and npm (or yarn) installed.
- A Sinch account with SMS API credentials (Service Plan ID, API Token).
- A provisioned phone number within your Sinch account.
- Basic familiarity with Node.js, REST APIs, and terminal commands.
- (Optional) Docker installed for containerization.
- (Optional) A database (e.g., PostgreSQL, MySQL) for campaign logging with Prisma.
System Architecture
```mermaid
graph LR
    A[Client / API Caller] -- HTTP POST /campaigns --> B(Fastify App);
    B -- Send SMS Request --> C(Sinch SMS REST API);
    C -- SMS Delivery --> D(Recipient Phones);
    B -- Log Campaign Data --> E[(Database)];
    C -- Response --> B;
    B -- API Response --> A;
    style B fill:#f9f,stroke:#333,stroke-width:2px
    style C fill:#ccf,stroke:#333,stroke-width:2px
    style E fill:#eee,stroke:#333,stroke-width:1px,stroke-dasharray: 5 5
```

(Note: Verify Mermaid diagram rendering on your publishing platform.)
1. Setting up the Project
Let's initialize our Node.js project and install the necessary dependencies.
- Create Project Directory: Open your terminal and create a directory for the project, then navigate into it.

  ```bash
  mkdir sinch-fastify-campaigns
  cd sinch-fastify-campaigns
  ```
- Initialize Node.js Project: This creates a `package.json` file to manage dependencies and project metadata. The `-y` flag accepts default settings.

  ```bash
  npm init -y
  ```

  Because the code in this guide uses ES module `import` syntax, add `"type": "module"` to the generated `package.json`.
- Install Dependencies: We need Fastify for the web server, Axios for HTTP requests, and Dotenv for environment variables.

  ```bash
  npm install fastify axios dotenv
  ```
- Install Prisma (Optional): If you plan to log campaigns to a database, install the Prisma CLI as a development dependency and the Prisma client as a runtime dependency.

  ```bash
  npm install --save-dev prisma
  npm install @prisma/client
  ```
- Initialize Prisma (Optional): This creates a `prisma` directory with a `schema.prisma` file and a `.env` file (if one doesn't exist). Choose your database provider when prompted.

  ```bash
  npx prisma init
  ```

  - Configuration: Update the `DATABASE_URL` in the generated `.env` file with your actual database connection string.
- Create Project Structure: Set up a basic structure for clarity.

  ```bash
  mkdir src
  touch src/server.js src/sinchService.js .env .gitignore
  ```

  - `src/server.js`: Main application file containing Fastify setup and routes.
  - `src/sinchService.js`: Module for interacting with the Sinch API.
  - `.env`: Stores sensitive credentials (API keys, database URL). Never commit this file.
  - `.gitignore`: Specifies intentionally untracked files that Git should ignore.
- Configure `.gitignore`: Add common Node.js ignores and the `.env` file.

  ```
  # .gitignore
  node_modules
  .env
  npm-debug.log*
  yarn-debug.log*
  yarn-error.log*
  dist
  coverage
  .DS_Store
  ```
- Configure `.env`: Add placeholders for your Sinch credentials and database URL. You will obtain these values later from your Sinch dashboard and database provider.

  ```bash
  # .env
  # Sinch Credentials
  SINCH_SERVICE_PLAN_ID=YOUR_SERVICE_PLAN_ID
  SINCH_API_TOKEN=YOUR_API_TOKEN
  SINCH_NUMBER=YOUR_SINCH_PHONE_NUMBER # e.g., +15551234567
  SINCH_REGION_URL=https://us.sms.api.sinch.com # Or eu., ca., au., br., etc.

  # Server Configuration
  PORT=3000

  # Database (Optional - Adjust based on your provider)
  # Example for PostgreSQL
  DATABASE_URL="postgresql://user:password@host:port/database?schema=public"
  ```

  - Purpose: Using `.env` keeps sensitive data out of your codebase, enhancing security. `dotenv` loads these variables into `process.env`.
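Since every later step depends on these variables, an optional fail-fast guard at startup can save debugging time. This is a sketch only; the `findMissingEnv` helper is not part of the guide's code:

```javascript
// Hypothetical startup guard: list required variables that are missing,
// so the app can fail fast with a clear message instead of a 500 later.
function findMissingEnv(env, required) {
  return required.filter((name) => !env[name]);
}

const required = ['SINCH_SERVICE_PLAN_ID', 'SINCH_API_TOKEN', 'SINCH_NUMBER'];
const missing = findMissingEnv(process.env, required);
if (missing.length > 0) {
  console.error(`Missing required environment variables: ${missing.join(', ')}`);
}
```

You could call this near the top of `src/server.js`, right after `dotenv.config()`.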
2. Implementing Core Functionality (Sinch Service)
We'll encapsulate the logic for sending SMS messages via Sinch in a dedicated service module.
- Edit `src/sinchService.js`: Create a function to handle the API call to Sinch's `/batches` endpoint.

  ```javascript
  // src/sinchService.js
  import axios from 'axios';

  // Load Sinch credentials securely from environment variables
  const SERVICE_PLAN_ID = process.env.SINCH_SERVICE_PLAN_ID;
  const API_TOKEN = process.env.SINCH_API_TOKEN;
  const SINCH_NUMBER = process.env.SINCH_NUMBER;
  const SINCH_API_BASE_URL = process.env.SINCH_REGION_URL || 'https://us.sms.api.sinch.com'; // Default to US region

  /**
   * Sends an SMS campaign batch using the Sinch REST API.
   * @param {string[]} recipients - An array of phone numbers in E.164 format (e.g., +15551234567).
   * @param {string} message - The text message body.
   * @returns {Promise<object>} - The response data from the Sinch API.
   * @throws {Error} - Throws an error if the API call fails.
   */
  async function sendSmsBatch(recipients, message) {
    if (!SERVICE_PLAN_ID || !API_TOKEN || !SINCH_NUMBER) {
      throw new Error('Sinch API credentials are not configured in .env file.');
    }
    if (!recipients || recipients.length === 0) {
      throw new Error('Recipient list cannot be empty.');
    }
    if (!message) {
      throw new Error('Message body cannot be empty.');
    }

    const endpoint = `${SINCH_API_BASE_URL}/xms/v1/${SERVICE_PLAN_ID}/batches`;

    // The 'to' field expects an array of recipient phone numbers (strings)
    const payload = {
      from: SINCH_NUMBER,
      to: recipients,
      body: message,
      // Optional parameters can be added here, e.g., delivery_report: 'full'
    };

    const config = {
      headers: {
        'Authorization': `Bearer ${API_TOKEN}`,
        'Content-Type': 'application/json',
      },
    };

    try {
      // Consider using a passed-in logger instance instead of console.log in production
      console.log(`Sending SMS batch to ${recipients.length} recipients via Sinch...`);
      const response = await axios.post(endpoint, payload, config);
      console.log('Sinch API response:', response.data);
      return response.data; // Contains batch_id, etc.
    } catch (error) {
      console.error('Error sending SMS via Sinch:', error.response?.data || error.message);
      // Re-throw a more specific error or handle it based on status code
      throw new Error(`Sinch API request failed: ${error.response?.data?.text || error.message}`);
    }
  }

  export { sendSmsBatch };
  ```

  - Why this approach?
    - Modularity: Separates Sinch interaction logic from the main server code.
    - Security: Loads credentials from environment variables, not hardcoded.
    - Error Handling: Includes basic validation and catches errors from the `axios` request.
    - Clarity: Uses the documented Sinch `/batches` endpoint structure, explicitly noting the `to` field requires an array.
3. Building the API Layer with Fastify
Now, let's set up the Fastify server and define the API endpoint to trigger the SMS sending.
- Edit `src/server.js`: Configure Fastify, load environment variables, define the route, and start the server.

  ```javascript
  // src/server.js
  import Fastify from 'fastify';
  import dotenv from 'dotenv';
  import { sendSmsBatch } from './sinchService.js';
  // Optional: Import Prisma client if logging campaigns
  // import { PrismaClient } from '@prisma/client';
  // Optional: Monitoring dependencies (install if needed: npm install prom-client)
  // import promClient from 'prom-client';

  // Load environment variables from .env file
  dotenv.config();

  // Optional: Initialize Prisma Client
  // const prisma = new PrismaClient();

  // Initialize Fastify with logging enabled
  const fastify = Fastify({
    logger: true // Uses Pino logger - efficient and structured logging
  });

  // Optional: Setup Prometheus Metrics
  /*
  const register = new promClient.Registry();
  promClient.collectDefaultMetrics({ register });
  // Add custom metrics here (e.g., HTTP request duration)
  fastify.get('/metrics', async (request, reply) => {
    reply.header('Content-Type', register.contentType);
    return register.metrics();
  });
  */

  // --- API Route Definition ---
  const campaignSchema = {
    body: {
      type: 'object',
      required: ['recipients', 'message'],
      properties: {
        recipients: {
          type: 'array',
          items: {
            type: 'string',
            pattern: '^\\+[1-9]\\d{1,14}$' // E.164 format validation
          },
          minItems: 1,
        },
        message: {
          type: 'string',
          minLength: 1,
          maxLength: 1600 // Generous limit, Sinch handles concatenation
        }
      }
    },
    response: {
      200: {
        type: 'object',
        properties: {
          message: { type: 'string' },
          batchId: { type: 'string' },
          recipientCount: { type: 'number' }
        }
      },
      // Define other response schemas (e.g., 400, 500) as needed
    }
  };

  fastify.post('/campaigns', { schema: campaignSchema }, async (request, reply) => {
    const { recipients, message } = request.body;
    let campaignLogId = null;

    // **Important:** Implement suppression list check here before proceeding.
    // Filter out recipients who have opted out.
    // e.g., const activeRecipients = await filterSuppressedNumbers(recipients);
    // if (activeRecipients.length === 0) { /* Handle appropriately */ }

    try {
      // Optional: Log campaign attempt before sending
      /*
      if (prisma) {
        const campaignLog = await prisma.campaign.create({
          data: {
            message: message,
            recipientCount: recipients.length, // Or activeRecipients.length
            status: 'PENDING',
          },
        });
        campaignLogId = campaignLog.id;
        fastify.log.info(`Logged campaign attempt with ID: ${campaignLogId}`);
      }
      */

      // Call the Sinch service function with potentially filtered recipients
      // const sinchResponse = await sendSmsBatch(activeRecipients, message);
      const sinchResponse = await sendSmsBatch(recipients, message); // Use original list if filtering not yet implemented

      // Optional: Update campaign log status on success
      /*
      if (prisma && campaignLogId) {
        await prisma.campaign.update({
          where: { id: campaignLogId },
          data: { status: 'SENT', batchId: sinchResponse.id },
        });
        fastify.log.info(`Updated campaign log ${campaignLogId} to SENT`);
      }
      */

      // Send success response
      reply.code(200).send({
        message: 'SMS campaign batch submitted successfully.',
        batchId: sinchResponse.id, // Sinch returns a batch ID
        recipientCount: recipients.length // Or activeRecipients.length
      });
    } catch (error) {
      fastify.log.error(`Campaign sending failed: ${error.message}`);

      // Optional: Update campaign log status on failure
      /*
      if (prisma && campaignLogId) {
        await prisma.campaign.update({
          where: { id: campaignLogId },
          data: { status: 'FAILED', errorDetails: error.message },
        });
        fastify.log.error(`Updated campaign log ${campaignLogId} to FAILED`);
      }
      */

      // Send error response - Adjust status code based on error type if needed
      reply.code(500).send({
        error: 'Failed to send SMS campaign.',
        details: error.message
      });
    }
  });

  // --- Health Check Route ---
  fastify.get('/health', async (request, reply) => {
    // Add checks for database connectivity or other dependencies if needed
    return { status: 'ok', timestamp: new Date().toISOString() };
  });

  // --- Start Server ---
  const start = async () => {
    try {
      const port = process.env.PORT || 3000;
      await fastify.listen({ port: parseInt(port, 10), host: '0.0.0.0' }); // Listen on all available network interfaces
      // fastify.log.info(`Server listening on port ${fastify.server.address().port}`); // Access port after listen resolves
    } catch (err) {
      fastify.log.error(err);
      // Optional: Disconnect Prisma before exiting
      // await prisma?.$disconnect();
      process.exit(1);
    }
  };

  start();

  // Optional: Graceful shutdown handling
  /*
  const setupGracefulShutdown = (signal) => {
    process.on(signal, async () => {
      fastify.log.info(`Received ${signal}. Shutting down gracefully...`);
      await fastify.close();
      // await prisma?.$disconnect();
      fastify.log.info('Server closed.');
      process.exit(0);
    });
  };
  setupGracefulShutdown('SIGINT');
  setupGracefulShutdown('SIGTERM');
  */
  ```

  - Why Fastify? Fastify's schema validation (`schema: campaignSchema`) automatically handles request body validation, improving security and reducing boilerplate code. Its logger (`fastify.log`) is highly performant.
  - Route Logic: The `/campaigns` route receives the request, validates it, (ideally) checks against a suppression list, calls the `sinchService`, handles potential errors, logs outcomes (optionally with Prisma), and sends back an appropriate response.
  - Health Check: The `/health` endpoint is crucial for monitoring and container orchestration.
  - Server Start: The `start` function initializes the server, listening on the configured port and host `0.0.0.0` (important for Docker).
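To convince yourself the schema's E.164 pattern behaves as intended, you can exercise the same regex in isolation, separate from the server code:

```javascript
// The E.164 pattern used in campaignSchema, tested standalone.
const e164 = /^\+[1-9]\d{1,14}$/;

const samples = [
  ['+15551234567', true],   // valid US number
  ['+447700900123', true],  // valid UK number
  ['15551234567', false],   // missing leading '+'
  ['+0155512345', false],   // country code cannot start with 0
];

for (const [num, expected] of samples) {
  console.log(num, e164.test(num) === expected ? 'ok' : 'MISMATCH');
}
```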
4. Integrating with Sinch (Credentials Setup)
To connect to Sinch, you need your API credentials.
-
Navigate to Sinch Dashboard: Log in to your Sinch Customer Dashboard.
-
Find SMS API Credentials:
- Go to SMS -> APIs.
- You will find your Service plan ID listed here.
- Click on your Service Plan ID name.
- Under API Credentials, find your API token. You might need to click "Show" or generate one if it doesn't exist.
- Scroll down to the Numbers section to find the phone number(s) associated with this service plan. Choose one to use as the sender (`from` number). It must be in E.164 format (e.g., `+15551234567`).
- Note the Region mentioned on the API page (e.g., US, EU). The corresponding API base URL is needed. Common URLs:
  - US: `https://us.sms.api.sinch.com`
  - EU: `https://eu.sms.api.sinch.com`
  - Canada: `https://ca.sms.api.sinch.com`
  - Australia: `https://au.sms.api.sinch.com`
  - Brazil: `https://br.sms.api.sinch.com`
- Update `.env` File: Paste the obtained values into your `.env` file:

  ```bash
  # .env
  SINCH_SERVICE_PLAN_ID=YOUR_ACTUAL_SERVICE_PLAN_ID
  SINCH_API_TOKEN=YOUR_ACTUAL_API_TOKEN
  SINCH_NUMBER=+1XXXXXXXXXX # Your actual Sinch number
  SINCH_REGION_URL=https://<region>.sms.api.sinch.com # Your actual region URL
  # ... other variables
  ```

  - Security: Keep the `.env` file secure and ensure it's listed in your `.gitignore`.
5. Error Handling, Logging, and Retry Mechanisms
Our current setup includes basic error handling and logging.
- Error Handling:
  - The `try...catch` blocks in `server.js` and `sinchService.js` catch exceptions.
  - Fastify's schema validation catches malformed requests before they hit the route handler.
  - The `sinchService` throws errors for missing credentials or failed API calls.
  - The API route returns a `500` status code on errors, providing a generic error message and logging specific details internally.
- Logging:
  - Fastify's built-in `logger: true` uses Pino for efficient, JSON-based logging. Logs include request details, errors, and informational messages (`fastify.log.info`, `fastify.log.error`).
  - We explicitly log Sinch API errors and successful submissions.
- Retry Mechanisms (Advanced):
  - For production systems sending critical messages, implementing retries with exponential backoff is recommended, especially for transient network errors or temporary Sinch API issues (e.g., 5xx errors).
  - Libraries like `axios-retry` or manual implementation using `setTimeout` can achieve this. This involves wrapping the `axios.post` call in `sinchService.js` within a retry loop.
  - Example Concept (Manual):

    ```javascript
    // Inside sinchService.js - Conceptual Retry Logic
    // Consider passing a logger instance for better decoupling than using console directly
    async function sendWithRetry(url, payload, config, logger = console, retries = 3, delay = 1000) {
      try {
        return await axios.post(url, payload, config);
      } catch (error) {
        // Only retry on specific errors (e.g., network or 5xx)
        if (retries > 0 && (!error.response || error.response.status >= 500)) {
          logger.warn(`Retrying Sinch request (${retries} left) after ${delay}ms delay...`);
          await new Promise(resolve => setTimeout(resolve, delay));
          // Pass the logger down in recursive calls
          return sendWithRetry(url, payload, config, logger, retries - 1, delay * 2); // Exponential backoff
        } else {
          throw error; // Max retries reached or non-retriable error
        }
      }
    }

    // In sendSmsBatch, replace the direct axios.post call with:
    // return await sendWithRetry(endpoint, payload, config, console); // Pass appropriate logger
    ```
- Testing Errors: Manually stop your network, provide invalid credentials in `.env`, or use tools like `toxiproxy` to simulate network failures between your app and the Sinch API.
6. Creating a Database Schema and Data Layer (Optional - Prisma)
If you initialized Prisma, let's define a schema to log campaign attempts.
- Define Schema (`prisma/schema.prisma`): Add a model to store basic campaign information.

  ```prisma
  // prisma/schema.prisma
  generator client {
    provider = "prisma-client-js"
  }

  datasource db {
    provider = "postgresql" // Or your chosen provider: mysql, sqlite, sqlserver, mongodb
    url      = env("DATABASE_URL")
  }

  model Campaign {
    id             Int      @id @default(autoincrement())
    createdAt      DateTime @default(now())
    message        String
    recipientCount Int
    status         String   // e.g., PENDING, SENT, FAILED
    batchId        String?  // Sinch batch ID, nullable
    errorDetails   String?  // Store error message on failure, nullable

    // @@map("campaigns") // Optional: Map model name to table name
  }

  // Optional: Model for Suppression List
  /*
  model SuppressionList {
    phoneNumber String   @id // E.164 format; @id already implies uniqueness
    reason      String?  // e.g., 'STOP', 'Complaint'
    createdAt   DateTime @default(now())
    updatedAt   DateTime @updatedAt

    // @@map("suppression_list")
  }
  */
  ```
- Create Database Migration: This command generates SQL migration files based on your schema changes and applies them to your database.

  ```bash
  npx prisma migrate dev --name init_campaign_model
  ```

  - Prisma names the migration via the `--name` flag (here, `init_campaign_model`) and executes it against the database specified in your `DATABASE_URL`. If you added the suppression list model, run migrate again.
- Generate Prisma Client: Ensure the Prisma client is generated/updated based on your schema.

  ```bash
  npx prisma generate
  ```
- Integrate with Server Code:
  - Uncomment the Prisma-related lines in `src/server.js` (import, initialization, `prisma.campaign.create`, `prisma.campaign.update`).
  - This adds database interaction to log campaign attempts and their final status (SENT/FAILED) along with the Sinch `batchId` or error details.
  - If implementing suppression, you would query the `SuppressionList` model in the `/campaigns` handler before calling `sendSmsBatch`.
7. Adding Security Features
Security is paramount for any API.
- Input Validation:
  - Done: Fastify's schema validation in the `POST /campaigns` route already checks the request body structure, data types, and applies constraints (e.g., E.164 format for phone numbers, message length). This prevents many injection-style attacks and malformed data issues.
- Rate Limiting:
  - Protect your API from abuse and brute-force attacks by limiting the number of requests a client can make.
  - Install the `@fastify/rate-limit` plugin:

    ```bash
    npm install @fastify/rate-limit
    ```

  - Register and configure it in `src/server.js`:

    ```javascript
    // src/server.js
    // ... other imports
    import rateLimit from '@fastify/rate-limit';

    // ... Initialize Fastify instance ...

    // Register plugins *before* routes
    await fastify.register(rateLimit, {
      max: 100, // Max requests per window per IP
      timeWindow: '1 minute' // Time window
      // Optional: keyGenerator, allowList, errorResponseBuilder etc.
    });

    // ... rest of the server code including route definitions
    ```

  - Adjust `max` and `timeWindow` based on expected usage and security requirements.
- Secrets Management:
  - Done: Using `.env` and `dotenv` keeps API keys and database URLs out of the code for local development.
  - Production: In deployment environments, do not commit `.env` files. Use the platform's secret management tools (e.g., Docker Secrets, Kubernetes Secrets, environment variables injected by the PaaS).
- HTTPS:
  - Always run your API over HTTPS in production. This is typically handled by a reverse proxy (like Nginx, Caddy) or the hosting platform (PaaS) placed in front of your Node.js application.
- Helmet (Optional but Recommended):
  - Use `@fastify/helmet` to set various security-related HTTP headers (like `X-Frame-Options`, `Strict-Transport-Security`).

    ```bash
    npm install @fastify/helmet
    ```

    ```javascript
    // src/server.js
    import helmet from '@fastify/helmet';

    // ... Initialize Fastify instance ...

    // Register plugins *before* routes
    await fastify.register(helmet);

    // ... rest of the server code
    ```
8. Handling Special Cases
Real-world SMS campaigns have nuances:
- Large Recipient Lists:
  - Sinch's `/batches` endpoint is designed for bulk sending, but check their documentation for the maximum recipients per batch (often thousands).
  - For very large lists (tens or hundreds of thousands), consider breaking them into smaller batches submitted sequentially or in parallel, respecting Sinch's API rate limits.
  - Asynchronous Processing: For large batches that might take time, use a background job queue (e.g., BullMQ with Redis) to process the sending request asynchronously. The API would enqueue the job and return an immediate acknowledgment (`202 Accepted`) to the client.
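Splitting a large list into batches can be as simple as the sketch below. The 1,000-per-batch figure is an illustrative assumption, not a documented limit; check Sinch's documentation for the actual per-batch maximum:

```javascript
// Hypothetical helper: split recipients into Sinch-sized batches.
// batchSize = 1000 is an illustrative assumption, not a documented limit.
function chunkRecipients(recipients, batchSize = 1000) {
  const batches = [];
  for (let i = 0; i < recipients.length; i += batchSize) {
    batches.push(recipients.slice(i, i + batchSize));
  }
  return batches;
}

// 2,500 recipients -> batches of 1000, 1000, and 500
const batches = chunkRecipients(new Array(2500).fill('+15551234567'));
console.log(batches.map((b) => b.length)); // [ 1000, 1000, 500 ]
```

Each batch could then be passed to `sendSmsBatch` sequentially, or enqueued as an individual background job.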
- Character Limits & Encoding:
- Standard GSM-7 encoding supports 160 characters per SMS segment. Longer messages are split (concatenated SMS).
- Using non-GSM characters (like emojis or specific symbols) switches to UCS-2 encoding, reducing the limit to 70 characters per segment.
- Sinch handles concatenation, but be mindful of costs as you're billed per segment. Inform users about potential multi-part messages.
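A rough segment estimator makes these limits concrete. This sketch treats any non-ASCII character as a UCS-2 trigger, which is only an approximation of true GSM-7 detection (the GSM 03.38 alphabet includes some non-ASCII characters and excludes some ASCII-adjacent ones); the 153/67 per-segment figures for concatenated messages are the standard values:

```javascript
// Rough SMS segment estimator (approximation: real GSM-7 detection
// requires the full GSM 03.38 alphabet table, not just an ASCII check).
function estimateSegments(message) {
  const isUcs2 = /[^\x00-\x7F]/.test(message); // non-ASCII => assume UCS-2
  const singleLimit = isUcs2 ? 70 : 160; // limit for a single-segment message
  const concatLimit = isUcs2 ? 67 : 153; // per-segment limit once concatenated
  if (message.length <= singleLimit) return 1;
  return Math.ceil(message.length / concatLimit);
}

console.log(estimateSegments('a'.repeat(160))); // 1
console.log(estimateSegments('a'.repeat(161))); // 2
```

An estimate like this is useful for warning campaign authors about multi-segment (and multi-cost) messages before sending.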
- Opt-Outs & Compliance:
  - Crucial: Respect regulations like TCPA (US), GDPR (EU), etc. Only message users who have explicitly consented.
  - Provide a clear opt-out mechanism (e.g., reply STOP). You will need to handle incoming SMS webhooks from Sinch to process opt-out keywords (like STOP).
  - Maintain a suppression list (e.g., in your database, using the optional `SuppressionList` model shown earlier).
  - Before sending any campaign, your API handler (`POST /campaigns`) must check the intended recipients against this suppression list and remove any opted-out numbers. This step is critical for compliance.
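The suppression check referenced in the route handler (`filterSuppressedNumbers`) might look like the sketch below. An in-memory `Set` stands in for the real lookup; in production the numbers would come from the `SuppressionList` table or a cache:

```javascript
// Illustrative suppression filter. The Set stands in for a database lookup.
const suppressed = new Set(['+15550000001', '+15550000002']);

function filterSuppressedNumbers(recipients) {
  return recipients.filter((num) => !suppressed.has(num));
}

const active = filterSuppressedNumbers(['+15550000001', '+15551234567']);
console.log(active); // [ '+15551234567' ]
```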
- International Formatting:
  - Done: Using E.164 format (`+` followed by country code and number) is essential for international deliverability. The schema validation enforces this.
- Delivery Reports (DLRs):
  - Sinch can send webhooks to your application with delivery status updates (delivered, failed, etc.). You can configure a webhook URL in your Sinch dashboard (under your Service Plan settings).
  - Create a corresponding Fastify route (e.g., `POST /sinch/dlr`) to receive these webhook events. Process the DLRs to update your campaign logs or database with the final delivery status for each message/batch. This requires exposing your application publicly (e.g., using `ngrok` for local development testing). Implementing this webhook handler is beyond the scope of this initial setup guide but is important for comprehensive tracking.
9. Implementing Performance Optimizations
While Fastify is inherently fast, consider these for high-load scenarios:
- Asynchronous Processing (Queues): As mentioned for large lists, offloading the Sinch API calls to a background job queue (like BullMQ) prevents blocking the main API thread and improves response times for the client.
- Database Connection Pooling: Prisma manages connection pooling automatically, which is generally efficient. Ensure your database server is adequately sized and configured.
- Caching: Caching isn't typically a major factor for the sending part, but if you frequently look up suppression lists before sending, caching that list (e.g., using Redis or Memcached with appropriate invalidation) can speed things up significantly for large lists.
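A minimal TTL cache sketch shows the idea; Redis or Memcached would replace this in any multi-instance deployment, and the 60-second TTL is an arbitrary illustrative choice:

```javascript
// Tiny in-memory TTL cache (single-process only; illustrative sketch).
function makeTtlCache(ttlMs) {
  let value = null;
  let expiresAt = 0;
  return {
    get: () => (Date.now() < expiresAt ? value : null), // null once stale
    set: (v) => { value = v; expiresAt = Date.now() + ttlMs; },
  };
}

const suppressionCache = makeTtlCache(60_000); // refresh at most once a minute
suppressionCache.set(new Set(['+15550000001']));
console.log(suppressionCache.get()?.has('+15550000001')); // true while fresh
```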
- Node.js Clustering: For CPU-bound tasks (less common in I/O-bound apps like this, but possible) or to leverage multi-core processors effectively, use Node.js's built-in `cluster` module or a process manager like PM2 (`pm2 start src/server.js -i max`) to run multiple instances of your application behind a load balancer. Fastify works well in clustered environments.
- Load Testing: Use tools like `k6`, `autocannon`, or `wrk` to simulate traffic and identify bottlenecks in your API endpoint, database interactions, or dependencies like the Sinch API.

  ```bash
  # Example using autocannon (install with npm i -g autocannon)
  # Ensure the server is running
  autocannon -m POST -H "Content-Type: application/json" -b '{"recipients":["+15551234567"],"message":"Test Load"}' http://localhost:3000/campaigns
  ```
- Profiling: Use Node.js's built-in profiler (`node --prof src/server.js` then `node --prof-process isolate-....log > processed.txt`) or tools like Clinic.js (`npm i -g clinic; clinic doctor -- node src/server.js`) to analyze CPU usage, event loop delays, and memory allocation to pinpoint performance issues in your code.
10. Adding Monitoring, Observability, and Analytics
Knowing how your service behaves in production is crucial.
- Health Checks:
  - Done: The `GET /health` endpoint provides a basic liveness check. Enhance it to check database connectivity (`prisma.$queryRaw` or similar) or other critical dependencies.
- Structured Logging:
  - Done: Fastify's Pino logger outputs JSON, which is easily ingested by log aggregation systems (ELK Stack, Splunk, Datadog, Grafana Loki). Ensure logs capture relevant context (request IDs - Fastify adds `reqId` automatically, batch IDs, user IDs if applicable).
- Performance Metrics (Prometheus Example):
  - Integrate a metrics library like `prom-client` to expose application metrics (request latency, error rates, queue sizes, external API call duration) in Prometheus format.
  - Set up Prometheus to scrape the `/metrics` endpoint and Grafana to visualize them on dashboards.

    ```bash
    # Install Prometheus client library
    npm install prom-client
    ```

  - Add the Prometheus setup code to `src/server.js` (see the commented-out example in Section 3). You'll need to initialize `prom-client` and register the `/metrics` route. Add custom metrics to track specific application behavior (e.g., campaign submission rate, Sinch API latency).
- Error Tracking:
  - Use services like Sentry, Bugsnag, or Datadog APM to capture, aggregate, and alert on application errors in real-time. These often provide more context than just logs (e.g., stack traces, request context). Integrate their SDKs into your Fastify application.
- Dashboards: Create Grafana (or similar) dashboards showing:
  - API request rate and latency (overall and per endpoint).
  - API error rates (4xx, 5xx).
  - Sinch API call latency and error rates (requires custom metrics).
  - Campaign processing throughput (campaigns submitted/sent per minute).
  - Database query performance (if applicable, using Prisma metrics or DB monitoring).
  - System resource usage (CPU, memory - often provided by the hosting platform or node_exporter).
- Alerting: Configure alerts (e.g., in Prometheus Alertmanager, Datadog, Sentry) based on metrics and logs:
  - High API error rate (> 1%).
  - High API latency (> 500ms p95).
  - High Sinch API error rate.
  - `/health` endpoint failures.
  - High resource utilization.
  - Job queue failures or high latency (if using queues).
11. Troubleshooting and Caveats
Common issues you might encounter:
- Sinch Errors:
  - `401 Unauthorized`: Incorrect `SERVICE_PLAN_ID` or `API_TOKEN`. Double-check `.env` and the Sinch dashboard. Ensure the token is for the SMS API, not other Sinch products. Verify the `Authorization: Bearer <token>` header format.
  - `400 Bad Request`: Invalid request format (check JSON structure against API docs), invalid `from` or `to` number format (must be E.164), message content issues, or other parameter problems. Check the `error.response.data` from Axios for specific details provided by Sinch.
  - `403 Forbidden` / `Insufficient Funds`: Your Sinch account may lack funds or permissions to send SMS to certain regions or using the specified `from` number. Check your account balance and settings.
  - `5xx Server Error`: Temporary issue on Sinch's side. Implement retries (see Section 5).
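When debugging these responses, a small helper that distinguishes "Sinch replied with an error" from "the request never completed" keeps logs readable. A sketch (the `describeSinchError` helper is not part of the guide's code; it just inspects the standard Axios error shape):

```javascript
// Hypothetical helper: summarize an Axios-style error for logging.
function describeSinchError(error) {
  if (error.response) {
    // Sinch replied; the response body usually explains the rejection
    return `Sinch ${error.response.status}: ${JSON.stringify(error.response.data)}`;
  }
  // No response at all: DNS, TLS, timeout, or a client-side bug
  return `Network/client error: ${error.message}`;
}

console.log(describeSinchError({ message: 'ETIMEDOUT' }));
console.log(describeSinchError({ response: { status: 401, data: { text: 'invalid token' } } }));
```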
- Configuration Issues:
  - `.env` not loaded: Ensure `dotenv.config()` is called early in `server.js`. Verify the `.env` file is in the project root where `node` is executed.
  - Incorrect `DATABASE_URL`: Check the format required by your database and Prisma. Ensure credentials and host are correct.
  - Firewall Issues: Ensure your server can make outbound requests to the Sinch API endpoint (`SINCH_REGION_URL`) and your database (if applicable). If implementing DLR webhooks, ensure Sinch can reach your `/sinch/dlr` endpoint.
- Code Errors:
  - Typos in variable names (`SERVICE_PLAN_ID`, etc.).
  - Incorrect `async/await` usage leading to unhandled promises.
  - Schema validation errors (check the `pattern` for E.164, required fields).
- Deployment Problems:
  - Environment variables not set correctly in the production environment (use the platform's secrets management).
  - Incorrect `host` binding in `fastify.listen` (use `0.0.0.0` for Docker/containers).
  - Port conflicts if the specified `PORT` is already in use.
- Compliance/Opt-Out Failures:
  - Critical: Forgetting to implement the suppression list check before sending can lead to legal issues and carrier filtering. Test this thoroughly. Ensure incoming STOP messages are processed correctly via webhooks.

Always check the detailed error messages logged by Fastify (`fastify.log.error`) and the response data from Axios/Sinch (`error.response?.data`) for specific clues when troubleshooting.