Mastering Streams in Node.js: Files, HTTP, and CSV Batching

September 16, 2025

Introduction

When working with large files or continuous data flows, many developers start with the simplest solution: reading the entire file into memory. This works for small files, but as soon as the file grows into hundreds of megabytes or even gigabytes, performance tanks and memory usage goes through the roof.

Node.js streams solve this by letting you process data chunk by chunk instead of all at once. Think of streams as a conveyor belt: you grab one piece, work on it, and move on, without ever needing the full dataset in memory.

In this article, we’ll explore streams with examples, including:

  • Why streams are better than loading a file into memory
  • Reading and writing files with streams
  • Streaming HTTP responses using pipes
  • Processing a CSV file in batches of 25 rows with pause/resume logic

Why Streams Instead of Reading the Whole File?

Consider the difference between using fs.readFile and a stream.

❌ Reading the Whole File at Once

import * as fs from "fs";

fs.readFile("large.txt", "utf8", (err, data) => {
  if (err) throw err;
  console.log("File length:", data.length);
});

This approach loads the entire file into memory as a string (or as a buffer if no encoding is given). For a 2 GB file, your process suddenly needs 2 GB of RAM. On machines with limited resources, this leads to memory exhaustion and crashes.

✅ Using Streams

import * as fs from "fs";

const readable = fs.createReadStream("large.txt", { encoding: "utf8" });

readable.on("data", (chunk) => {
  console.log("Received chunk of size:", chunk.length);
});

readable.on("end", () => {
  console.log("Stream reading complete!");
});

Here, the file is read in small chunks (64 KB by default for file streams, controlled by the highWaterMark option). Memory usage stays low regardless of file size: even a 10 GB file can be processed without running out of memory.
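
If the default chunk size does not suit your workload, it can be tuned through the highWaterMark option. A minimal sketch, using an illustrative 1 MB chunk size:

import * as fs from "fs";

// Read in ~1 MB chunks instead of the 64 KB default (illustrative value).
const readable = fs.createReadStream("large.txt", {
  encoding: "utf8",
  highWaterMark: 1024 * 1024, // in bytes
});

readable.on("data", (chunk) => {
  console.log("Received chunk of size:", chunk.length);
});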

Example 1: Reading and Writing Files with Streams

Copying a file with streams avoids loading everything into RAM:

import * as fs from "fs";

const readable = fs.createReadStream("input.txt", { encoding: "utf8" });
const writable = fs.createWriteStream("output.txt");

readable.pipe(writable);

writable.on("finish", () => {
  console.log("File copied successfully using streams!");
});

Example 2: Streaming HTTP Requests with Pipes

Streams also shine in network operations, such as downloading a file directly to disk:

import * as http from "http";
import * as fs from "fs";

const file = fs.createWriteStream("downloaded.html");

http.get("http://example.com", (response) => {
  response.pipe(file);

  response.on("end", () => {
    console.log("Download complete!");
  });

  response.on("error", (err) => {
    console.error("Error during HTTP request:", err);
  });
});

✅ No need to buffer the entire response in memory.
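
The same pattern works in reverse when serving files: piping a read stream into the HTTP response keeps memory usage flat regardless of file size. A minimal sketch (the port and file name are placeholders):

import * as http from "http";
import * as fs from "fs";

http.createServer((req, res) => {
  const file = fs.createReadStream("large.txt"); // placeholder file

  file.on("error", (err) => {
    // Typically fires before any data is sent (e.g. file not found).
    res.statusCode = 500;
    res.end("Failed to read file");
    console.error("Read error:", err);
  });

  res.setHeader("Content-Type", "text/plain");
  file.pipe(res); // stream the file straight to the client
}).listen(3000); // placeholder port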

Example 3: Processing a CSV File in Batches of 25 Rows

For large CSV datasets, streams let us process data row by row and in controlled batches. Here’s how we can read a CSV, save every 25 rows to a database, pause while saving, and then resume:

import * as fs from "fs";
import csv from "csv-parser";

interface UserRow {
  id: string;
  name: string;
  email: string;
}

// Fake database insert function
async function saveBatchToDatabase(batch: UserRow[]): Promise<void> {
  console.log(`Saving batch of ${batch.length} rows to DB...`);
  await new Promise((resolve) => setTimeout(resolve, 500)); // simulate delay
}

async function processCSV(filePath: string): Promise<void> {
  return new Promise((resolve, reject) => {
    const results: UserRow[] = [];
    const stream = fs.createReadStream(filePath).pipe(csv());

    stream.on("data", async (row: UserRow) => {
      results.push(row);

      if (results.length === 25) {
        stream.pause(); // stop reading temporarily
        try {
          await saveBatchToDatabase(results.splice(0, 25));
          stream.resume(); // resume reading
        } catch (err) {
          stream.destroy(err as Error); // emit error
        }
      }
    });

    stream.on("end", async () => {
      if (results.length > 0) {
        try {
          await saveBatchToDatabase(results);
        } catch (err) {
          return reject(err);
        }
      }
      console.log("CSV processing complete!");
      resolve();
    });

    stream.on("error", (err) => {
      console.error("Stream error:", err);
      reject(err);
    });
  });
}

processCSV("users.csv").catch((err) => {
  console.error("Failed to process CSV:", err);
});

✅ Features:

  • Keeps memory stable by holding only 25 rows at a time.
  • Uses pause() and resume() to control flow.
  • Handles errors gracefully with stream.destroy().
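
As an alternative to manual pause()/resume(), Readable streams are also async iterable, so the same 25-row batching can be expressed with a for await...of loop, which applies backpressure automatically. A sketch of that approach, reusing the UserRow interface and saveBatchToDatabase helper from above:

import * as fs from "fs";
import csv from "csv-parser";

async function processCSVWithIteration(filePath: string): Promise<void> {
  const batch: UserRow[] = [];

  // Async iteration pulls the next row only after the loop body
  // (including the await) has finished, so no explicit pause is needed.
  for await (const row of fs.createReadStream(filePath).pipe(csv())) {
    batch.push(row as UserRow);

    if (batch.length === 25) {
      await saveBatchToDatabase(batch.splice(0, 25));
    }
  }

  if (batch.length > 0) {
    await saveBatchToDatabase(batch); // flush the final partial batch
  }

  console.log("CSV processing complete!");
}

Stream errors reject the loop, so callers can handle failures with the same .catch() pattern used above.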

Why Streams Matter

Streams are essential in Node.js for:

  • Efficiency → avoids loading entire files into memory
  • Scalability → can process multi-GB files without crashing
  • Flexibility → works with files, HTTP, sockets, and databases
  • Control → batching with pause/resume prevents overload

Conclusion

Whenever you’re working with large files or continuous data flows in Node.js, streams keep your applications efficient, scalable, and reliable.

With the examples above, you now know how to:

  • Read and write files using streams
  • Pipe HTTP responses directly to disk
  • Process CSV files in batches with pause/resume control