Node.js, MongoDB And Angular Web Development: The Definitive Guide To Using MEAN Stack Build Applications (Developer' Brad Dayley & Brendan Caleb Mongo DB Development
");
}
Converting an Array into a String
A useful feature of Array objects is the ability to combine the elements of an array
into a single String object, separated by a specific separator, using the
join() method. For example, the following code results in the time components
being joined back together into the format 12:10:36:
var timeArr = [12,10,36];
var timeStr = timeArr.join(":");
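As a brief illustration, the following sketch joins the same array with and without an explicit separator and then reverses the operation with split(). The variable names csv, parts, and numbers are illustrative only.

```javascript
// Join numeric time components into a single string.
var timeArr = [12, 10, 36];
var timeStr = timeArr.join(":");

// With no argument, join() uses a comma as the separator.
var csv = timeArr.join();

// split() reverses the operation, but the pieces come back as strings.
var parts = timeStr.split(":");

// Convert the string pieces back to numbers when numeric values are needed.
var numbers = parts.map(Number);

console.log(timeStr);   // 12:10:36
console.log(csv);       // 12,10,36
console.log(parts);
console.log(numbers);
```

Note that join() does not modify the original array; it only returns a new string.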
Checking Whether an Array Contains an Item
Often you need to check to see whether an array contains a certain item. This can be
done by using the indexOf() method. If the item is not found in the list, a -1 is
returned. The following function writes a message to the console if an item is in the
week array:
function message(day){
  var week = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"];
  if (week.indexOf(day) != -1){
    console.log("Happy " + day);
  }
}
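Because indexOf() returns a position rather than a true/false value, the result should be compared against -1 instead of being tested for truthiness. The following sketch shows why; the isWeekday() helper is a hypothetical name used only for illustration.

```javascript
var week = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"];

// Hypothetical helper: true when the day is found anywhere in the array.
function isWeekday(day) {
  return week.indexOf(day) !== -1;
}

// The first item is found at position 0, which is a falsy value.
console.log(week.indexOf("Monday"));     // 0
// Items that are not in the array produce -1.
console.log(week.indexOf("Saturday"));   // -1
// The comparison is case-sensitive, so this is also not found.
console.log(week.indexOf("monday"));     // -1

console.log(isWeekday("Monday"));        // true
console.log(isWeekday("Saturday"));      // false
```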
Adding and Removing Items to Arrays
Array objects provide several built-in methods for adding and removing items.
Table 2.8 shows the methods used in this book, applied one after another to the
same array.
Table 2.8 Array object methods used to add and remove elements from arrays
Statement                                 Value of x    Value of arr
var arr = [1,2,3,4,5];                    undefined     1,2,3,4,5
var x = 0;                                0             1,2,3,4,5
x = arr.unshift("zero");                  6 (length)    zero,1,2,3,4,5
x = arr.push(6,7,8);                      9 (length)    zero,1,2,3,4,5,6,7,8
x = arr.shift();                          zero          1,2,3,4,5,6,7,8
x = arr.pop();                            8             1,2,3,4,5,6,7
x = arr.splice(3,3,"four","five","six");  4,5,6         1,2,3,four,five,six,7
x = arr.splice(3,1);                      four          1,2,3,five,six,7
x = arr.splice(3);                        five,six,7    1,2,3
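The sequence in Table 2.8 can be reproduced with a short script. The sketch below runs the same calls in the same order; the record() helper and the steps array are hypothetical names added only to capture the results.

```javascript
var arr = [1, 2, 3, 4, 5];
var steps = [];

// Hypothetical helper: records the return value and the array after each call.
function record(label, x) {
  steps.push(label + " -> x: " + x + " | arr: " + arr.join());
}

// unshift() and push() return the new length of the array.
record('unshift("zero")', arr.unshift("zero"));
record("push(6,7,8)", arr.push(6, 7, 8));

// shift() and pop() return the item that was removed.
record("shift()", arr.shift());
record("pop()", arr.pop());

// splice(start, deleteCount, ...items) returns an array of the removed items.
record('splice(3,3,"four","five","six")', arr.splice(3, 3, "four", "five", "six"));
record("splice(3,1)", arr.splice(3, 1));

// With only a start index, splice() removes everything from that index on.
record("splice(3)", arr.splice(3));

console.log(steps.join("\n"));
```

All of these methods modify the array in place, which is why each row of the table starts from the result of the row before it.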
Adding Error Handling
An important part of JavaScript coding is adding error handling for instances where
there may be problems. By default, if a code exception occurs because of a problem
in your JavaScript, the script fails and does not finish loading. This is not usually the
desired behavior; in fact, it is often catastrophic. To prevent these types of problems,
wrap your code in a try/catch block.
try/catch Blocks
To prevent your code from totally bombing out, use try/catch blocks that can
handle problems inside your code. If JavaScript encounters an error when executing
code in a try block, it jumps down and executes the catch portion instead of
stopping the entire script. If no error occurs, then all of the try block is executed
and none of the catch block.
For example, the following try/catch block tries to assign variable x to a value of
an undefined variable named badVarName.
try{
  var x = badVarName;
} catch (err){
  console.log(err.name + ': "' + err.message + '" occurred when assigning x.');
}
Notice that the catch statement accepts an err parameter, which is an error object.
The error object provides the message property that provides a description of the
error. The error object also provides a name property, which is the name of the error
type that was thrown.
The preceding code results in an exception and writes the following message:
ReferenceError: "badVarName is not defined" occurred when assigning x.
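The name property is most useful when different kinds of failures need different handling. The following sketch triggers several built-in error types and reports the name of each; describeError() is a hypothetical helper, and the exact message text varies between JavaScript engine versions.

```javascript
// Hypothetical helper: runs a function and describes any error it throws.
function describeError(fn) {
  try {
    fn();
    return "no error";
  } catch (err) {
    return err.name + ": " + err.message;
  }
}

var results = [
  // Reading a variable that was never declared.
  describeError(function () { return badVarName; }),
  // Reading a property of null.
  describeError(function () { return null.length; }),
  // Creating an array with an invalid length.
  describeError(function () { return new Array(-1); }),
  // Parsing malformed JSON text.
  describeError(function () { return JSON.parse("{oops"); }),
  // A function that completes normally.
  describeError(function () { return 1 + 1; })
];

results.forEach(function (line) {
  console.log(line);
});
```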
Throw Your Own Errors
You can also throw your own errors using a throw statement. The following code
illustrates how to add throw statements to a function to throw an error even if a
script error does not occur. The function sqrRoot() accepts a single argument x.
It then tests x to verify that it is a positive number and returns a string with the
square root of x. If x is not a positive number, then the appropriate error is thrown
and the catch block returns the error:
function sqrRoot(x) {
  try {
    if(x=="") throw {message:"Can't Square Root Nothing"};
    if(isNaN(x)) throw {message:"Can't Square Root Strings"};
    if(x<0) throw {message:"Sorry No Imagination"};
    return "sqrt("+x+") = " + Math.sqrt(x);
  } catch(err){
    return err.message;
  }
}
function writeIt(){
  console.log(sqrRoot("four"));
  console.log(sqrRoot(""));
  console.log(sqrRoot("4"));
  console.log(sqrRoot("-4"));
}
writeIt();
The following is the console output showing the different errors that are thrown
based on input to the sqrRoot() function:
Can't Square Root Strings
Can't Square Root Nothing
sqrt(4) = 2
Sorry No Imagination
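A throw statement accepts any value, but throwing one of the built-in Error types gives the caller a name property and a stack trace in addition to the message. The following is a sketch of that alternative, not a replacement for the example above; sqrRootStrict() is a hypothetical name, and here the caller, rather than the function itself, catches the error.

```javascript
// Hypothetical variant of sqrRoot() that throws Error objects to its caller.
function sqrRootStrict(x) {
  if (x === "") {
    throw new Error("Can't Square Root Nothing");
  }
  if (isNaN(x)) {
    throw new TypeError("Can't Square Root Strings");
  }
  if (x < 0) {
    throw new RangeError("Sorry No Imagination");
  }
  return Math.sqrt(x);
}

var outcomes = [];
["four", "", "4", "-4"].forEach(function (input) {
  try {
    outcomes.push("sqrt(" + input + ") = " + sqrRootStrict(input));
  } catch (err) {
    // err.name tells the caller which kind of failure occurred.
    outcomes.push(err.name + ": " + err.message);
  }
});

console.log(outcomes.join("\n"));
```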
Using finally
Another valuable tool in exception handling is the finally keyword. A finally
block can be added to the end of a try/catch block. After the try/catch
blocks are executed, the finally block is always executed. It doesn’t matter whether
an error occurs and is caught or the try block is fully executed.
The following is an example of using a finally block inside a webpage:
function testTryCatch(value){
  try {
    if (value < 0){
      throw new Error("too small");
    } else if (value > 10){
      throw new Error("too big");
    }
    // your code here
  } catch (err) {
    console.log("The number was " + err.message);
  } finally {
    console.log("This is always written.");
  }
}
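To see the order in which the blocks run, the following sketch records each step in an array instead of writing directly to the console. The checkRange() function and the log array are hypothetical names used for illustration.

```javascript
var log = [];

// Hypothetical helper: records which blocks run for a given value.
function checkRange(value) {
  try {
    if (value < 0) {
      throw new Error("too small");
    } else if (value > 10) {
      throw new Error("too big");
    }
    log.push(value + ": try completed");
    return "ok";
  } catch (err) {
    log.push(value + ": caught " + err.message);
    return "failed";
  } finally {
    // Runs after try or catch, even though both of them return.
    log.push(value + ": finally");
  }
}

var statuses = [checkRange(5), checkRange(-1), checkRange(11)];

console.log(statuses.join(", "));
console.log(log.join("\n"));
```

The finally block is a natural place for cleanup work, such as closing a file or connection, that must happen whether or not an error occurred.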
Summary
Understanding JavaScript is critical to working in the Node.js, MongoDB, Express,
and Angular environments. This chapter discussed enough of the basic JavaScript
language syntax for you to grasp the concepts in the rest of the book. The chapter
discussed creating objects and functions, as well as working with strings and arrays.
You also learned how to apply error handling to your scripts, which is critical in the
Node.js environment.
Next
In the next chapter, you jump right into the basics of setting up a Node.js project.
You also learn a few of the language idioms and see a simple practical example.
Part II: Learning Node.js
3
Getting Started with Node.js
This chapter introduces you to the Node.js environment. Node.js is a
website/application framework designed with high scalability in mind. It was
designed to take advantage of the existing JavaScript technology in the browser and
flow those same concepts all the way down through the webserver into the backend
services. Node.js is a great technology that is easy to implement and yet extremely
scalable.
Node.js is a modular platform, meaning that much of the functionality is provided by
external modules rather than being built in to the platform. The Node.js culture is
active in creating and publishing modules for almost every imaginable need.
Therefore, much of this chapter focuses on understanding and using the Node.js tools
to build, publish, and use your own Node.js modules in applications.
Understanding Node.js
Node.js was developed in 2009 by Ryan Dahl as an answer to the frustration caused
by concurrency issues, especially when dealing with web services. Google had just
come out with the V8 JavaScript engine for the Chrome web browser, which was
highly optimized for web traffic. Dahl created Node.js on top of V8 as a server-side
environment that matched the client-side environment in the browser.
The result is an extremely scalable server-side environment that allows developers to
more easily bridge the gap between client and server. The fact that Node.js
applications are written in JavaScript allows developers to easily navigate back and
forth between client and server code and even reuse code between the two environments.
Node.js has a great ecosystem with new extensions being written all the time. The
Node.js environment is clean and easy to install, configure, and deploy. Literally in
only an hour or two you can have a Node.js webserver up and running.
Who Uses Node.js?
Node.js quickly gained popularity among a wide variety of companies. These
companies use Node.js first and foremost for scalability but also for ease of
maintenance and faster development. The following are just a few of the companies
using the Node.js technology:
Yahoo!
LinkedIn
eBay
New York Times
Dow Jones
Microsoft
What Is Node.js Used For?
Node.js can be used for a wide variety of purposes. Because it is based on V8 and
has highly optimized code to handle HTTP traffic, the most common use is as a
webserver. However, Node.js can also be used for a variety of other web services
such as:
Web services APIs such as REST
Real-time multiplayer games
Backend web services such as cross-domain, server-side requests
Web-based applications
Multiclient communication such as IM
What Does Node.js Come With?
Node.js comes with many built-in modules available right out of the box. This book
covers many but not all of these modules:
Assertion testing: Allows you to test functionality within your code.
Buffer: Enables interaction with TCP streams and file system operations. (See
Chapter 5, “Handling Data I/O in Node.js.”)
C/C++ add-ons: Allows for C or C++ code to be used just like any other
Node.js module.
Child processes: Allows you to create child processes. (See Chapter 9,
“Scaling Applications Using Multiple Processors in Node.js.”)
Cluster: Enables the use of multicore systems. (See Chapter 9.)
Command line options: Gives you Node.js commands to use from a terminal.
Console: Gives the user a debugging console.
Crypto: Allows for the creation of custom encryption. (See Chapter 10, “Using
Additional Node.js Modules.”)
Debugger: Allows debugging of a Node.js file.
DNS: Allows connections to DNS servers. (See Chapter 10.)
Errors: Allows for the handling of errors.
Events: Enables the handling of asynchronous events. (See Chapter 4, “Using
Events, Listeners, Timers, and Callbacks in Node.js.”)
File system: Allows for file I/O with both synchronous and asynchronous
methods. (See Chapter 6, “Accessing the File System from Node.js.”)
Globals: Makes frequently used modules available without having to include
them first. (See Chapter 10.)
HTTP: Enables support for many HTTP features. (See Chapter 7,
“Implementing HTTP Services in Node.js.”)
HTTPS: Enables HTTP over the TLS/SSL. (See Chapter 7.)
Modules: Provides the module loading system for Node.js. (See Chapter 3.)
Net: Allows the creation of servers and clients. (See Chapter 8, “Implementing
Socket Services in Node.js.”)
OS: Allows access to the operating system that Node.js is running on. (See
Chapter 10.)
Path: Enables access to file and directory paths. (See Chapter 6.)
Process: Provides information and allows control over the current Node.js
process. (See Chapter 9.)
Query strings: Allows for parsing and formatting URL queries. (See Chapter
7.)
Readline: Enables an interface to read from a data stream. (See Chapter 5.)
REPL: Allows developers to create a command shell.
Stream: Provides an API to build objects with the stream interface. (See
Chapter 5.)
String decoder: Provides an API to decode buffer objects into strings. (See
Chapter 5.)
Timers: Allows for scheduling functions to be called in the future. (See Chapter
4.)
TLS/SSL: Implements TLS and SSL protocols. (See Chapter 8.)
URL: Enables URL resolution and parsing. (See Chapter 7.)
Utilities: Provides support for various apps and modules.
V8: Exposes APIs for the Node.js version of V8. (See Chapter 10.)
VM: Allows for a V8 virtual machine to run and compile code.
ZLIB: Enables compression using Gzip and Deflate/Inflate. (See Chapter 5.)
Installing Node.js
To easily install Node.js, download an installer from the Node.js website at
http://nodejs.org. The Node.js installer installs the necessary files on your PC to get
Node.js up and running. No additional configuration is necessary to start creating
Node.js applications.
Looking at the Node.js Install Location
If you look at the install location, you will see a couple of executable files and a
node_modules folder. The node executable file starts the Node.js JavaScript
VM. The following list describes the executables in the Node.js install location that
you need to get started:
node: This file starts a Node.js JavaScript VM. If you pass in a JavaScript file
location, Node.js executes that script. If no target JavaScript file is specified,
then a script prompt is shown that allows you to execute JavaScript code
directly from the console.
npm: This command is used to manage the Node.js packages discussed in the
next section.
node_modules: This folder contains the installed Node.js packages. These
packages act as libraries that extend the capabilities of Node.js.
Verify Node.js Executables
Take a minute and verify that Node.js is installed and working before moving on. To
do so, open a console prompt and execute the following command to bring up a
Node.js VM:
node
Next, at the Node.js prompt execute the following to write "Hello World" to the
screen.
>console.log("Hello World");
You should see "Hello World" output to the console screen. Now exit the
console by pressing Ctrl+C twice. Pressing Ctrl+D or typing .exit also works.
Next, verify that the npm command is working by executing the following command
in the OS console prompt:
npm version
You should see output similar to the following:
{ npm: '3.10.5',
ares: '1.10.1-DEV',
http_parser: '2.7.0',
icu: '57.1',
modules: '48',
node: '6.5.0',
openssl: '1.0.2h',
uv: '1.9.1',
v8: '5.1.281.81',
zlib: '1.2.8'}
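Similar component information is available from inside a script through the global process object. The following sketch prints a few of the entries; the exact set of keys and the version numbers depend on the Node.js release you have installed.

```javascript
// process is a global object, so no require() call is needed.
console.log("Node.js version:", process.version);
console.log("V8 version:", process.versions.v8);
console.log("libuv version:", process.versions.uv);
console.log("zlib version:", process.versions.zlib);

// List every component name reported for this runtime.
var components = Object.keys(process.versions);
console.log(components.join(", "));
```

Save the code to a file and run it with the node command followed by the filename.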
Selecting a Node.js IDE
If you are planning on using an Integrated Development Environment (IDE) for your
Node.js projects, you should take a minute and configure that now as well. Most
developers are particular about the IDE that they like to use, and most IDEs can be
configured for JavaScript, if not for Node.js directly. For example,
Eclipse has some great Node.js plugins, and the WebStorm IDE by JetBrains has some
good features for Node.js built in. If you are unsure of where to start, we use Visual
Studio Code for the built-in TypeScript functionality required later in this book.
That said, you can use any editor you want to generate your Node.js web
applications. In reality, all you need is a decent text editor. Almost all the code you
will generate will be .js, .json, .html, and .css. So pick the editor in which you feel
the most comfortable writing those types of files.
Working with Node Packages
One of the most powerful features of the Node.js framework is the ability to easily
extend it with additional Node Packaged Modules (NPMs) using the Node Package
Manager (NPM). That’s right, in the Node.js world, NPM stands for two things. This
book refers to the Node Packaged Modules as modules to make it easier to follow.
What Are Node Packaged Modules?
A Node Packaged Module is a packaged library that can easily be shared, reused,
and installed in different projects. Many different modules are available for a variety
of purposes. For example, the Mongoose module provides an ODM (Object Document
Mapper) for MongoDB, Express extends Node’s HTTP capabilities, and so on.
Node.js modules are created by various third-party organizations to provide the
needed features that Node.js lacks out of the box. This community of contributors is
active in adding and updating modules.
Node Packaged Modules include a package.json file that defines the package.
The package.json file includes informational metadata, such as the name,
version, author, and contributors, as well as control metadata, such as dependencies
and other requirements that the Node Package Manager uses when performing
actions such as installation and publishing.
Understanding the Node Package Registry
The Node modules have a managed location called the Node Package Registry
where packages are registered. This allows you to publish your own packages in a
location where others can use them as well as download packages that others have
created.
The Node Package Registry is located at https://npmjs.com. From this location you
can view the newest and most popular modules as well as search for specific
packages, as shown in Figure 3.1.
Figure 3.1 The official Node Package Modules website
Using the Node Package Manager
The Node Package Manager you have already seen is a command-line utility. It
allows you to find, install, remove, publish, and do everything else related to Node
Package Modules. The Node Package Manager provides the link between the Node
Package Registry and your development environment.
The simplest way to really explain the Node Package Manager is to list some of the
command-line options and what they do. You use many of these options in the rest
of this chapter and throughout the book. Table 3.1 lists the Node Package Manager
commands.
Table 3.1 npm command-line options (with express as the package, where
appropriate)
Option       Description                                  Example
search       Finds module packages in the repository      npm search express
install      Installs a package either using a            npm install
             package.json file, from the repository,      npm install express
             or a local location                          npm install express@0.1.1
                                                          npm install ../tModule.tgz
install -g   Installs a package globally                  npm install express -g
remove       Removes a module                             npm remove express
pack         Packages the module defined by the           npm pack
             package.json file into a .tgz file
view         Displays module details                      npm view express
publish      Publishes the module defined by a            npm publish
             package.json file to the registry
unpublish    Unpublishes a module you have published      npm unpublish myModule
owner        Allows you to add, remove, and list owners   npm owner add bdayley myModule
             of a package in the repository               npm owner rm bdayley myModule
                                                          npm owner ls myModule
Searching for Node Package Modules
You can also search for modules in the Node Package Registry directly from the
command prompt using the npm search <search_string> command. For
example, the following command searches for modules related to openssl and
displays the results as shown in Figure 3.2:
npm search openssl
Figure 3.2 Searching for Node.js modules from the command prompt
Installing Node Packaged Modules
To use a Node module in your applications, it must first be installed where Node can
find it. To install a Node module, use the npm install <module_name>
command. This downloads the Node module to your development environment and
places it into the node_modules folder where the install command is run. For
example, the following command installs the express module:
npm install express
The output of the npm install command displays the dependency hierarchy
installed with the module. For example, the following code block shows part of the
output from installing the express module.
C:\express\example
`-- express@4.14.0
+-- accepts@1.3.3
| +-- mime-types@2.1.11
| | `-- mime-db@1.23.0
| `-- negotiator@0.6.1
+-- array-flatten@1.1.1
+-- content-disposition@0.5.1
+-- content-type@1.0.2
+-- cookie@0.3.1
+-- cookie-signature@1.0.6
+-- debug@2.2.0
| `-- ms@0.7.1 ...
The dependency hierarchy is listed; some of the modules that Express requires are the
cookie-signature, range-parser, debug, fresh, cookie, and send
modules. Each of these was downloaded during the install. Notice that the version of
each dependency module is listed.
Node.js has to be able to handle dependency conflicts. For example, the express
module requires cookie 0.3.1, but another module may require cookie
0.3.0. To handle this situation, a separate copy for the cookie module is placed in
each module’s folder under another node_modules folder.
To illustrate how modules are stored in a hierarchy, consider the following example
of how express looks on disk. Notice that the cookie and send modules are
located under the express module hierarchy, and that since the send module
requires mime it is located under the send hierarchy:
./
./node_modules
./node_modules/express
./node_modules/express/node_modules/cookie
./node_modules/express/node_modules/send
./node_modules/express/node_modules/send/node_modules/mime
Using package.json
All Node modules must include a package.json file in their root directory. The
package.json file is a simple JSON text file that defines the module including
dependencies. The package.json file can contain a number of different
directives to tell the Node Package Manager how to handle the module.
The following is an example of a package.json file with a name, version,
description, and dependencies:
{
  "name": "my_module",
  "version": "0.1.0",
  "description": "a simple node.js module",
  "dependencies" : {
    "express" : "latest"
  }
}
The only required directives in the package.json file are name and version.
The rest depend on what you want to include. Table 3.2 describes the most common
directives:
Table 3.2 Directives used in the package.json file
Directive     Description                   Example
name          Unique name of package.       "name": "camelot"
preferGlobal  Indicates this module         "preferGlobal": true
              prefers to be installed
              globally.
version       Version of the module.        "version": "0.0.1"
author        Author of the project.        "author": "arthur@???.com"
description   Textual description of        "description": "a silly place"
              module.
contributors  Additional contributors to    "contributors": [
              the module.                     { "name": "gwen",
                                                "email": "gwen@???.com"}]
bin           Binary to be installed        "bin": {
              globally with project.          "excalibur": "./bin/excalibur"}
scripts       Specifies parameters that     "scripts": {
              execute console apps when       "start": "node ./bin/excalibur",
              launching node.                 "test": "echo testing"}
main          Specifies the main entry      "main": "./bin/excalibur"
              point for the app. This can
              be a binary or a .js file.
repository    Specifies the repository      "repository": {
              type and location of the        "type": "git",
              package.                        "url": "http://???.com/c.git"}
keywords      Specifies keywords that       "keywords": [
              show up in the npm              "swallow", "unladen" ]
              search.
dependencies  Modules and versions this     "dependencies": {
              module depends on. You          "express": "latest",
              can use the * and x             "connect": "2.x.x",
              wildcards.                      "cookies": "*" }
engines       Version of node this          "engines": {
              package works with.             "node": ">=6.5"}
A great way to use package.json files is to automatically download and install
the dependencies for your Node.js app. All you need to do is create a
package.json file in the root of your project code and add the necessary
dependencies to it. For example, the following package.json requires the
express module as a dependency.
{
  "name": "my_module",
  "version": "0.1.0",
  "dependencies" : {
    "express" : "latest"
  }
}
Then you run the following command from the root of your package, and the express
module is automatically installed.
npm install
Notice that no module is specified in the npm install. That is because npm
looks for a package.json file by default. Later, as you need additional modules,
all you need to do is add those to the dependencies directive and then run npm
install again.
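Because package.json is plain JSON, your own scripts can read it just as easily as npm can. The following sketch parses a sample package definition and lists the dependencies it declares. It is only an illustration of the file format, not a description of how npm itself is implemented; the sample text is held in a string so that the example is self-contained, whereas a real script would typically read the file from disk.

```javascript
// Sample package.json contents held in a string (an assumption for this sketch).
var packageText = JSON.stringify({
  name: "my_module",
  version: "0.1.0",
  dependencies: { express: "latest" }
});

// Parse the JSON text into a regular JavaScript object.
var pkg = JSON.parse(packageText);

// A package with no dependencies may omit the directive entirely.
var deps = pkg.dependencies || {};

// Build one "name@version" entry per declared dependency.
var entries = Object.keys(deps).map(function (name) {
  return name + "@" + deps[name];
});

console.log(pkg.name + " " + pkg.version + " depends on: " + entries.join(", "));
```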
Creating a Node.js Application
Now you have enough information to jump into a Node.js project and get your feet
wet. In this section, you create your own Node Packaged Module and then use that
module as a library in a Node.js application.
The code in this exercise is kept to a minimum so that you can see exactly how to
create a package, publish it, and then use it again.
Creating a Node.js Packaged Module
To create a Node.js Packaged Module you need to create the functionality in
JavaScript, define the package using a package.json file, and then either publish
it to the registry or package it for local use.
The following steps take you through the process of building a Node.js Packaged
Module using an example called censorify. The censorify module accepts
text and then replaces certain words with asterisks:
1. Create a project folder named .../censorify. This is the root of the
package.
2. Inside that folder, create a file named censortext.js.
3. Add the code from Listing 3.1 to censortext.js. Most of the code is just
basic JavaScript; however, note that lines 18–20 export the functions
censor(), addCensoredWord(), and getCensoredWords(). These
exports assignments are required for Node.js applications using this module to
have access to censor() and the other two functions.
Listing 3.1 censortext.js: Implementing a simple censor function and
exporting it for other modules using the package
01 var censoredWords = ["sad", "bad", "mad"];
02 var customCensoredWords = [];
03 function censor(inStr) {
04   for (var idx in censoredWords) {
05     inStr = inStr.replace(censoredWords[idx], "****");
06   }
07   for (idx in customCensoredWords) {
08     inStr = inStr.replace(customCensoredWords[idx], "****");
09   }
10   return inStr;
11 }
12 function addCensoredWord(word){
13   customCensoredWords.push(word);
14 }
15 function getCensoredWords(){
16   return censoredWords.concat(customCensoredWords);
17 }
18 exports.censor = censor;
19 exports.addCensoredWord = addCensoredWord;
20 exports.getCensoredWords = getCensoredWords;
4. Once the module code is completed, you need to create a package.json
file that is used to generate the Node.js Packaged Module. Create a
package.json file in the .../censorify folder. Then add contents
similar to Listing 3.2. Specifically, you need to add the name, version, and
main directives as a minimum. The main directive needs to be the name of
the main JavaScript module that will be loaded, in this case censortext.
Note that the .js is not required; Node.js automatically searches for the .js
extension.
Listing 3.2 package.json: Defining the Node.js module
{
  "author": "Brendan Dayley",
  "name": "censorify",
  "version": "0.1.1",
  "description": "Censors words out of text",
  "main": "censortext",
  "dependencies": {},
  "engines": {
    "node": "*"
  }
}
5. Create a file named README.md in the .../censorify folder. You can
put whatever read me instructions you want in this file.
6. Navigate to the .../censorify folder in a console window and run the
npm pack command to build a local package module.
7. The npm pack command creates a censorify-0.1.1.tgz file in the
.../censorify folder. This is your first Node.js Packaged Module.
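One limitation of the censor() function in Listing 3.1 is worth knowing about: when replace() is given a plain string as its first argument, it replaces only the first occurrence, so a word that appears twice is censored once. The following sketch shows one possible variant that uses a global, case-insensitive regular expression instead. The censorAll() and escapeRegExp() names are hypothetical and are not part of the censorify module.

```javascript
var censoredWords = ["sad", "bad", "mad"];

// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical variant of censor(): replaces every whole-word match, ignoring case.
function censorAll(inStr, words) {
  return words.reduce(function (result, word) {
    // \b limits matches to whole words; g replaces all matches; i ignores case.
    var pattern = new RegExp("\\b" + escapeRegExp(word) + "\\b", "gi");
    return result.replace(pattern, "****");
  }, inStr);
}

var firstOnly = "sad and sad".replace("sad", "****");
var everyMatch = censorAll("Sad, sad, and more sad. Not a badge.", censoredWords);

console.log(firstOnly);    // **** and sad
console.log(everyMatch);   // ****, ****, and more ****. Not a badge.
```

The word boundary check is a design choice: without it, the word "badge" would be partially censored because it contains "bad".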
Publishing a Node.js Packaged Module to the NPM Registry
In the previous section you created a local Node.js Packaged Module using the npm
pack command. You can also publish that same module to the NPM repository at
http://npmjs.com/.
When modules are published to the NPM registry, they are accessible to everyone
using the NPM manager utility discussed earlier. This allows you to distribute your
modules and applications to others more easily.
The following steps describe the process of publishing the module to the NPM
registry. These steps assume that you have completed steps 1 through 5 from the
previous section:
1. Create a public repository to contain the code for the module. Then push the
contents of the .../censorify folder up to that location. The following is
an example of a GitHub repository URL:
https://github.com/username/projectname/directoryName/ch03/
2. Create an account at https://npmjs.org/signup.
3. Use the npm adduser command from a console prompt to add the user you
created to the environment.
4. Type in the username, password, and email that you used to create the account
in step 2.
5. Modify the package.json file to include the new repository information
and any keywords that you want made available in the registry search as shown
in lines 7–14 in Listing 3.3.
Listing 3.3 package.json: Defining the Node.js module that includes the
repository and keywords information
01 {
02   "author": "Brad Dayley",
03   "name": "censorify",
04   "version": "0.1.1",
05   "description": "Censors words out of text",
06   "main": "censortext",
07   "repository": {
08     "type": "git",
09     "url": "Enter your github url"
10   },
11   "keywords": [
12     "censor",
13     "words"
14   ],
15   "dependencies": {},
16   "engines": {
17     "node": "*"
18   }
19 }
6. Publish the module using the following command from the .../censorify
folder in the console:
npm publish
Once the package has been published you can search for it on the NPM registry and
use the npm install command to install it into your environment.
To remove a package from the registry make sure that you have added a user with
rights to the module to the environment using npm adduser and then execute the
following command:
npm unpublish <package_name>
For example, the following command unpublishes the censorify module:
npm unpublish censorify
In some instances you cannot unpublish the module without using the --force
option. This option forces the removal and deletion of the module from the registry.
For example:
npm unpublish censorify --force
Using a Node.js Packaged Module in a Node.js Application
In the previous sections you learned how to create and publish a Node.js module.
This section provides an example of actually using a Node.js module inside your
Node.js applications. Node.js makes this simple: All you need to do is install the
module into your application structure and then use the require() method to load
it.
The require() method accepts either an installed module name or a path to a .js
file located on the file system. For example:
require("censorify")
require("./lib/utils.js")
The .js filename extension is optional. If it is omitted, Node.js searches for it.
The following steps take you through that process so you can see how easy it is:
1. Create a project folder named .../readwords.
2. From a console prompt inside the .../readwords folder, use the following
command to install the censorify module from the censorify-0.1.1.tgz package you created earlier:
npm install .../censorify/censorify-0.1.1.tgz
3. Or if you have published the censorify module, you can use the standard
command to download and install it from the NPM registry:
npm install censorify
4. Verify that a folder named node_modules is created along with a subfolder
named censorify.
5. Create a file named .../readwords/readwords.js.
6. Add the contents shown in Listing 3.4 to the readwords.js file. Notice
that a require() call loads the censorify module and assigns it to the
variable censor. Then the censor variable can be used to invoke the
getCensoredWords(), addCensoredWord(), and censor()
functions from the censorify module.
Listing 3.4 readwords.js: Loading the censorify module when displaying
text
var censor = require("censorify");
console.log(censor.getCensoredWords());
console.log(censor.censor("Some very sad, bad and mad text."));
censor.addCensoredWord("gloomy");
console.log(censor.getCensoredWords());
console.log(censor.censor("A very gloomy day."));
7. Run the readwords.js application using the node readwords.js
command and view the output shown in the following code block. Notice that
the censored words are replaced with **** and that the new censored word
gloomy is added to the censorify module instance censor.
C:\nodeCode\ch03\readwords>node readwords
[ 'sad', 'bad', 'mad' ]
Some very ****, **** and **** text.
[ 'sad', 'bad', 'mad', 'gloomy' ]
A very **** day.
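The require() method can also load a module directly from a file path, without installing a package first. The following sketch demonstrates this by writing a small throwaway module to a temporary folder and loading it both with and without the .js extension. The greeter.js filename, the greet() function, and the require-demo- folder prefix are all hypothetical names chosen for this illustration.

```javascript
var fs = require("fs");
var os = require("os");
var path = require("path");

// Create a throwaway module file in a temporary folder.
var dir = fs.mkdtempSync(path.join(os.tmpdir(), "require-demo-"));
var modFile = path.join(dir, "greeter.js");
fs.writeFileSync(modFile,
  "exports.greet = function (name) { return 'Hello ' + name; };");

// Load the module by file path, with and without the .js extension.
var greeter = require(modFile);
var sameGreeter = require(path.join(dir, "greeter"));

console.log(greeter.greet("Node"));      // Hello Node

// Both calls return the same object because loaded modules are cached.
console.log(greeter === sameGreeter);    // true

// Clean up the temporary files.
fs.unlinkSync(modFile);
fs.rmdirSync(dir);
```

Because modules are cached after the first load, state kept inside a module, such as the custom censored words list in censorify, is shared by every file in the application that requires it.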
Writing Data to the Console
One of the most useful modules in Node.js during the development process is the
console module. This module provides a lot of functionality when writing debug
and information statements to the console. The console module allows you to
control output to the console, implement time delta output, and write tracebacks and
assertions to the console. This section covers using the console module because
you need to know it for subsequent chapters in the book.
Because the console module is so widely used, you do not need to load it into
your modules using a require() statement. You simply call the console function
using console.<function>(<parameters>). Table 3.3 lists the functions
available in the console module.
Table 3.3 Member functions of the console module
Function
Description
log([data],[...])
Writes data output to the console. The data variable
can be a string or an object that can be resolved to a
string. Additional parameters can also be sent. For
example:
console.log("There are %d items", 5);
>>There are 5 items
info([data],[...])
Same as console.log.
error([data],
[...])
Same as console.log; however, the output is also
sent to stderr.
warn([data],[...])
Same as console.error.
dir(obj)
Writes out a string representation of a JavaScript
object to the console. For example:
console.dir({name:"Brad", role:"Author"});
>> { name: 'Brad', role: 'Author' }
time(label)
Assigns a current timestamp with ms precision to the
string label.
timeEnd(label)
Creates a delta between the current time and the
timestamp assigned to label and outputs the results.
For example:
console.time("FileWrite");
f.write(data); //takes about 500ms
console.timeEnd("FileWrite");
>> FileWrite: 500ms
trace(label)
Writes out a stack trace of the current position in code
to stderr. For example:
console.trace("traceMark");
>>Trace: traceMark
at Object.<anonymous> (C:\test.js:24:9)
at Module._compile (module.js:456:26)
at Object.Module._ext.js (module.js:474:10)
at Module.load (module.js:356:32)
at Function.Module._load (module.js:312:12)
at Function.Module.runMain(module.js:497:10)
at startup (node.js:119:16)
at node.js:901:3
assert(expression,[message])
Writes the message and stack trace to the console if
expression evaluates to false.
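As a brief illustration of several functions from Table 3.3, the following is a minimal sketch rather than part of the original example. It assumes util.format(), which applies the same printf-style formatting that console.log() uses; the helper name describeItems and the timer label "loop" are hypothetical.

```javascript
const util = require("util");

// Hypothetical helper: builds the same string console.log() would print.
function describeItems(count) {
  return util.format("There are %d items", count);
}

console.time("loop");                 // start a timer labeled "loop"
let total = 0;
for (let i = 0; i < 1000; i++) {
  total += i;
}
console.timeEnd("loop");              // prints the elapsed time for "loop"

console.log(describeItems(5));        // There are 5 items
console.error("total is %d", total);  // written to stderr
console.assert(total === 499500, "unexpected total"); // prints nothing when true
```

Because the elapsed time varies from run to run, only the formatted strings are predictable here.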
Summary
This chapter focused on getting you up to speed on the Node.js environment. Node.js
Packaged Modules provide the functionality that Node.js does not inherently come
with. You can download these modules from the NPM registry, or you can even
create and publish your own. The package.json file provides the configuration
and definition for every Node.js module.
The examples in this chapter covered creating, publishing, and installing your own
Node.js Packaged Modules. You learned how to use the NPM to package a local
module as well as publish one to the NPM registry. You then learned how to install
the Node.js modules and use them in your own Node.js applications.
Next
The next chapter covers the event-driven nature of Node.js. You see how events
work in the Node.js environment and learn how to control, manipulate, and use them
in your applications.
4
Using Events, Listeners, Timers, and
Callbacks in Node.js
Node.js provides scalability and performance through its powerful event-driven
model. This chapter focuses on understanding the model and how it differs from
traditional threading models used by most webservers. Understanding the event
model is critical because it may force you to change the design thinking for your
applications. However, the changes will be well worth the improvement in speed that
you get using Node.js.
This chapter also covers the different methods you use to add work to the Node.js
event queue. You can add work using event listeners or timers, or you can schedule
work directly. You also learn how to implement events in your own custom modules
and objects.
Understanding the Node.js Event Model
Node.js applications are run in a single-threaded event-driven model. Although
Node.js implements a thread pool in the background to do work, the application
itself doesn’t have any concept of multiple threads. “Wait, what about performance
and scale?” you might ask. At first it may seem counterintuitive, but once you
understand the logic behind the Node.js event model it all makes perfect sense.
Comparing Event Callbacks and Threaded Models
In the traditional threaded web model, a request comes in to the webserver and is
assigned to an available thread. Then the handling of work for that request continues
on that thread until the request is complete and a response is sent.
Figure 4.1 illustrates the threaded model processing two requests, GetFile and
GetData. The GetFile request first opens the file, reads the contents, and then
sends the data back in a response. All this occurs in order on the same thread. The
GetData request connects to the DB, queries the necessary data, and then sends the
data in the response.
Figure 4.1 Processing two requests on individual threads using the threaded model
The Node.js event model does things differently. Instead of executing all the work
for each request on individual threads, work is added to an event queue and then
picked up by a single thread running an event loop. The event loop grabs the top
item in the event queue, executes it, and then grabs the next item. When executing
code that is longer lived or has blocking I/O, instead of calling the function
directly, the function is added to the event queue along with a callback that is
executed after the function completes. When all events on the Node.js event queue
have been executed, the Node application terminates.
Figure 4.2 illustrates the way Node.js handles the GetFile and GetData requests.
The GetFile and GetData requests are added to the event queue. Node.js first
picks up the GetFile request, executes it, and then completes by adding the
Open() callback function to the event queue. Next, it picks up the GetData
request, executes it, and completes by adding the Connect() callback function to
the event queue. This continues until there are no callback functions to be executed.
Notice in Figure 4.2 that the events for each request do not necessarily follow a direct
interleaved order. For example, the Connect request takes longer to complete than
the Read request, so Send(file) is called before Query(db).
Figure 4.2 Processing two requests on a single event-driven thread using the Node.js
event model
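The queue behavior described above can be approximated with a toy model. This is only a sketch under stated assumptions: Node.js does not expose its event queue this way, and the names enqueue, enqueueSteps, and runLoop are hypothetical. The step names are taken from the GetFile and GetData requests in the text.

```javascript
// Toy model only: this is not how Node.js implements its event loop.
const queue = [];
const executed = [];

// Add a named unit of work to the back of the queue.
function enqueue(name, work) {
  queue.push({ name: name, work: work });
}

// Chain step names so that each step enqueues the next one when it runs.
function enqueueSteps(names) {
  if (names.length === 0) return;
  enqueue(names[0], function () {
    enqueueSteps(names.slice(1));
  });
}

// The "event loop": take the top item, run it, then take the next.
function runLoop() {
  while (queue.length > 0) {
    const event = queue.shift();
    executed.push(event.name);
    event.work();
  }
}

enqueueSteps(["GetFile", "Open(file)", "Read(file)", "Send(file)"]);
enqueueSteps(["GetData", "Connect(db)", "Query(db)", "Send(data)"]);
runLoop();
console.log(executed.join(" -> "));
```

In this toy model every step completes instantly, so the two requests interleave strictly. In a real application the order depends on how long each I/O operation takes.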
Blocking I/O in Node.js
The Node.js event model of using the event callbacks is great until you run into the
problem of functions that block waiting for I/O. Blocking I/O stops the execution of
the current thread and waits for a response before continuing. Some examples of
blocking I/O are
Reading a file
Querying a database
Socket request
Accessing a remote service
The reason Node.js uses event callbacks is to avoid having to wait for blocking I/O.
Therefore, any requests that perform blocking I/O are performed on a different
thread in the background. Node.js implements a thread pool in the background.
When an event that requires blocking I/O is retrieved from the event queue, Node.js
retrieves a thread from the thread pool and executes the function there instead of on
the main event loop thread. This prevents the blocking I/O from holding up the rest
of the events in the event queue.
The function executed on the blocking thread can still add events back to the event
queue to be processed. For example, a database query call is typically passed a
callback function that parses the results and may schedule additional work on the
event queue before sending a response.
Figure 4.3 illustrates the full Node.js event model including the event queue, event
loop, and the thread pool. Notice that the event loop either executes the function on
the event loop thread itself or, for blocking I/O, it executes the function on a separate
thread.
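A short sketch can make the hand-off visible. It assumes the current working directory is readable, and the order array is a hypothetical name used only to record what ran first.

```javascript
const fs = require("fs");

const order = [];

// fs.stat() hands the file system work to the background and returns at once.
fs.stat(process.cwd(), function (err, stats) {
  order.push("stat callback");
  console.log(order.join(", "));
});

// This line runs before the callback above, because the current code must
// finish before the event loop can pick up the completed I/O event.
order.push("code after fs.stat()");
```

The callback always runs second, regardless of how quickly the file system responds.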
The Conversation Example
To help you understand how events work in Node.js versus traditional threaded
webservers, consider the example of having different conversations with a large
group of people at a party. You are acting the part of the webserver, and the
conversations represent the work necessary to process different types of web
requests. Your conversations are broken up into several segments with different
individuals. You end up talking to one person and then another. Then you go back to
the first person and then to a third person, back to the second, and so on.
This example has many similarities to webserver processing. Some conversations
end quickly, like a simple request for a piece of data in memory. Others are broken
up into several segments as you go back and forth between individuals, similar to a
more complex server-side conversation. Still others have long breaks when you are
waiting for the other person to respond, similar to blocking I/O requests to the file
system, database, or remote service.
Figure 4.3 In the Node.js event model, work is added as a function with callback to
the event queue, and then picked up on the event loop thread. The function is then
executed on the event loop thread in the case of non-blocking, or on a separate thread
in the case of blocking
Using the traditional webserver threading model in the conversation example sounds
great at first because each thread acts like you. The threads/clones can talk back and
forth with each person, and it almost seems as though you can have multiple
conversations simultaneously. There are two problems with this model.
First, you are limited by the number of clones. What if you only have five clones?
To talk with a sixth person, one clone must completely finish its conversation. The
second problem is the limited number of CPUs (or “brains”) that the threads
(“clones”) must share. This means that clones sharing the same brain have to stop
talking/listening while other clones are using the brain. You can see that there really
isn’t a benefit to having clones when they freeze while the other clones are using the
brain.
The Node.js event model acts more like real life when compared to the conversation
example. First, Node.js applications run on a single thread, which means there is
only one of you, no clones. Each time a person asks you a question, you respond as
soon as you can. Your interactions are completely event driven, and you move
naturally from one person to the next. Therefore, you can have as many
conversations going on at the same time as you want by bouncing between
individuals. Second, your brain is always focused on the person you are talking to
since you aren’t sharing it with clones.
So how does Node.js handle blocking I/O requests? That is where the background
thread pool comes into play. Node.js hands blocking requests over to a thread in the
thread pool so that it has minimal impact on the application processing events. Think
about when someone asks you a question that you have to think about. You can still
interact with others at the party while trying to process that question in the back of
your mind. That processing may impact how fast you interact with others, but you
are still able to communicate with several people while processing the longer-lived
thought.
Adding Work to the Event Queue
As you create your Node.js applications, keep in mind the event model described in
the previous section and apply it to the way you design your code. To leverage the
scalability and performance of the event model, make sure that you break work up
into chunks that can be performed as a series of callbacks.
Once you have designed your code correctly, you can then use the event model to
schedule work on the event queue. In Node.js applications, work is scheduled on the
event queue by passing a callback function using one of these methods:
Make a call to one of the blocking I/O library calls such as writing to a file or
connecting to a database.
Add a built-in event listener to a built-in event such as an http.request or
server.connection.
Create your own event emitters and add custom listeners to them.
Use the process.nextTick option to schedule work to be picked up on the
next cycle of the event loop.
Use timers to schedule work to be done after a particular amount of time or at
periodic intervals.
The following sections discuss implementing timers, nextTick, and custom
events. They give you an idea of how the event mechanism works. The blocking I/O
calls and built-in events are covered in subsequent chapters.
Implementing Timers
A useful feature of Node.js and JavaScript is the ability to delay execution of code
for a period of time. This can be useful for cleanup or refresh work that you do not
want to always be running. There are three types of timers you can implement in
Node.js: timeout, interval, and immediate. The following sections describe each of
these timers and how to implement them in your code.
Delaying Work with Timeouts
Timeout timers are used to delay work for a specific amount of time. When that time
expires, the callback function is executed and the timer goes away. Use timeouts for
work that only needs to be performed once.
Timeout timers are created using the setTimeout(callback,
delayMilliSeconds, [args]) method built into Node.js. When you call
setTimeout(), the callback function is executed after delayMilliSeconds
expires. For example, the following executes myFunc() after 1 second:
setTimeout(myFunc, 1000);
The setTimeout() function returns a timer object ID. You can pass this ID to
clearTimeout(timeoutId) at any time before the delayMilliSeconds
expires to cancel the timeout function. For example:
myTimeout = setTimeout(myFunc, 100000);
…
clearTimeout(myTimeout);
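To see cancellation in action, here is a minimal sketch. The fired flag and the short delays are assumptions chosen so that the example finishes quickly.

```javascript
let fired = false;

function myFunc() {
  fired = true;
  console.log("timeout fired");
}

// Schedule myFunc() to run after 100 ms, then cancel it before it expires.
const myTimeout = setTimeout(myFunc, 100);
clearTimeout(myTimeout);

// A second, later timer confirms that the first one never ran.
setTimeout(function () {
  console.log("fired =", fired);
}, 200);
```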
Listing 4.1 implements a series of simple timeouts that call the
simpleTimeout() function, which outputs the number of milliseconds since the
timeout was scheduled. Notice that it doesn’t matter which order setTimeout()
is called; the results, shown in Listing 4.1 Output, are in the order that the delay
expires.
Listing 4.1 simple_timer.js: Implementing a series of timeouts at various
intervals
function simpleTimeout(consoleTimer){
console.timeEnd(consoleTimer);
}
console.time("twoSecond");
setTimeout(simpleTimeout, 2000, "twoSecond");
console.time("oneSecond");
setTimeout(simpleTimeout, 1000, "oneSecond");
console.time("fiveSecond");
setTimeout(simpleTimeout, 5000, "fiveSecond");
console.time("50MilliSecond");
setTimeout(simpleTimeout, 50, "50MilliSecond");
Listing 4.1 Output simple_timer.js: Timeout functions executed at
different delay amounts
C:\books\node\ch04> node simple_timer.js
50MilliSecond: 50.489ms
oneSecond: 1000.688ms
twoSecond: 2000.665ms
fiveSecond: 5000.186ms
Performing Periodic Work with Intervals
Interval timers are used to perform work on a regular delayed interval. When the
delay time expires, the callback function is executed and is then rescheduled for the
delay interval again. Use intervals for work that needs to be performed on a regular
basis.
Interval timers are created using the setInterval(callback,
delayMilliSeconds, [args]) method built into Node.js. When you call
setInterval(), the callback function is executed every interval after
delayMilliSeconds has expired. For example, the following executes
myFunc() every second:
setInterval(myFunc, 1000);
The setInterval() function returns a timer object ID. You can pass this ID to
clearInterval(intervalId) at any time to cancel the interval function. For example:
myInterval = setInterval(myFunc, 100000);
…
clearInterval(myInterval);
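An interval can also cancel itself from inside its own callback. The following sketch assumes a hypothetical ticks counter and a 50 ms delay.

```javascript
let ticks = 0;

const myInterval = setInterval(function () {
  ticks += 1;
  console.log("tick %d", ticks);
  if (ticks === 3) {
    clearInterval(myInterval);   // stop after the third run
  }
}, 50);
```

Once the interval is cleared, nothing is left on the event queue and the program exits.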
Listing 4.2 implements a series of simple interval callbacks that update the values of
the variables x, y, and z at different intervals. Notice that the values of x, y, and z
are changed differently because the interval amounts are different, with x
incrementing twice as fast as y, which increments twice as fast as z, as shown in
Listing 4.2 Output.
Listing 4.2 simple_interval.js: Implementing a series of update callbacks
at various intervals
var x=0, y=0, z=0;
function displayValues(){
console.log("x=%d; y=%d; z=%d", x, y, z);
}
function updateX(){
x += 1;
}
function updateY(){
y += 1;
}
function updateZ(){
z += 1;
displayValues();
}
setInterval(updateX, 500);
setInterval(updateY, 1000);
setInterval(updateZ, 2000);
Listing 4.2 Output simple_interval.js: Interval functions executed at
different delay amounts
C:\books\node\ch04> node simple_interval.js
x=3; y=1; z=1
x=7; y=3; z=2
x=11; y=5; z=3
x=15; y=7; z=4
x=19; y=9; z=5
x=23; y=11; z=6
Performing Immediate Work with an Immediate Timer
Immediate timers are used to perform work on a function as soon as the I/O event
callbacks are executed, but before any timeout or interval events are executed. This
allows you to schedule work to be done after the current events in the event queue
are completed. Use immediate timers to yield long-running execution segments to
other callbacks to prevent starving the I/O events.
Immediate timers are created using the setImmediate(callback,[args])
method built into Node.js. When you call setImmediate(), the callback function
is placed on the event queue and popped off once for each iteration through the event
queue loop after I/O events have a chance to be called. For example, the following
schedules myFunc() to execute on the next cycle through the event queue:
setImmediate(myFunc);
The setImmediate() function returns a timer object ID. You can pass this ID to
clearImmediate(immediateId) at any time before it is picked up off the
event queue. For example:
myImmediate = setImmediate(myFunc);
…
clearImmediate(myImmediate);
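One way to yield a long-running segment, as described above, is to split the work into chunks and schedule each chunk with setImmediate(). This is a sketch only; the sumInChunks helper and its parameters are hypothetical.

```javascript
// Hypothetical helper: sums 1..limit in chunks, yielding between chunks.
function sumInChunks(limit, chunkSize, done) {
  let total = 0;
  let next = 1;

  function runChunk() {
    const end = Math.min(next + chunkSize - 1, limit);
    for (; next <= end; next++) {
      total += next;
    }
    if (next <= limit) {
      setImmediate(runChunk);    // let pending I/O callbacks run first
    } else {
      done(total);
    }
  }

  runChunk();
}

sumInChunks(100000, 10000, function (total) {
  console.log("total = %d", total);
});
```

Each chunk is short, so I/O callbacks waiting on the event queue get a chance to run between chunks.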
Dereferencing Timers from the Event Loop
Often you do not want timer event callbacks to continue to be scheduled when they
are the only events left in the event queue. Node.js provides a useful utility to handle
this case. The unref() function available in the object returned by
setInterval and setTimeout allows you to notify the event loop to not
continue when these are the only events on the queue.
For example, the following dereferences the myInterval interval timer:
myInterval = setInterval(myFunc);
myInterval.unref();
If for some reason you later do not want the program to terminate if the interval
function is the only event left on the queue, you can use the ref() function to rereference it:
myInterval.ref();
Warning
When using unref() with setTimeout timers, a separate timer is used to wake
up the event loop. Creating a lot of these can cause an adverse performance
impact on your code, so use them sparingly.
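The effect of unref() can be sketched as follows. The example assumes a Node.js version that provides hasRef() on timer objects, and the log messages are illustrative.

```javascript
// An interval that would normally keep the process alive forever.
const myInterval = setInterval(function () {
  console.log("background refresh");
}, 1000);

myInterval.unref();                 // do not keep the event loop alive for this timer
console.log(myInterval.hasRef());   // false

// Other pending work still keeps the process running until it completes.
setTimeout(function () {
  console.log("main work done; the process can now exit");
}, 100);
```

The program exits shortly after the 100 ms timeout fires, before the interval ever runs.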
Using nextTick to Schedule Work
A useful method of scheduling work on the event queue is the
process.nextTick(callback) function. This function schedules work to be
run on the next cycle of the event loop. Unlike the setImmediate() method,
nextTick() executes before the I/O events are fired. This can result in starvation
of the I/O events, so early versions of Node.js limited the number of nextTick()
events that could be executed each cycle through the event queue to the value of
process.maxTickDepth, which defaulted to 1000. That setting was removed in
Node.js 0.12.
Listing 4.3 illustrates the order of events when using a blocking I/O call, timers, and
nextTick(). Notice that the blocking call fs.stat() is made first, then two
setImmediate() calls, and then two nextTick() calls. Listing 4.3 Output
shows that both nextTick() calls are executed before any of the others. Then both
setImmediate() calls are executed, followed by the fs.stat() callback.
Listing 4.3 nexttick.js: Implementing a series of blocking fs calls,
immediate timers, and nextTick() calls to show the order in which they get
executed
var fs = require("fs");
fs.stat("nexttick.js", function(){
console.log("nexttick.js Exists");
});
setImmediate(function(){
console.log("Immediate Timer 1 Executed");
});
setImmediate(function(){
console.log("Immediate Timer 2 Executed");
});
process.nextTick(function(){
console.log("Next Tick 1 Executed");
});
process.nextTick(function(){
console.log("Next Tick 2 Executed");
});
Listing 4.3 Output nexttick.js: Executing the nextTick() calls first
c:\books\node\ch04>node nexttick.js
Next Tick 1 Executed
Next Tick 2 Executed
Immediate Timer 1 Executed
Immediate Timer 2 Executed
nexttick.js Exists
Implementing Event Emitters and Listeners
In the following chapters you get a chance to implement many of the events built in
to the various Node.js modules. This section focuses on creating your own custom
events as well as implementing listener callbacks that get executed when an
event is emitted.
Adding Custom Events to Your JavaScript Objects
Events are emitted using an EventEmitter object. This object is included in the
events module. The emit(eventName, [args]) function triggers the
eventName event and includes any arguments provided. The following code
snippet shows how to implement a simple event emitter:
var events = require('events');
var emitter = new events.EventEmitter();
emitter.emit("simpleEvent");
Occasionally you want to add events directly to your JavaScript objects. To do that
you need to inherit the EventEmitter functionality in your object by calling
events.EventEmitter.call(this) in your object instantiation as well as
adding the events.EventEmitter.prototype to your object’s prototyping.
For example:
function MyObj(){
events.EventEmitter.call(this);
}
MyObj.prototype.__proto__ = events.EventEmitter.prototype;
You then can emit events directly from instances of your object. For example:
var myObj = new MyObj();
myObj.emit("someEvent");
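On Node.js versions that support ES2015 classes, the same inheritance can be written with class and extends. This is an alternative sketch, not the pattern used in the listings that follow; the received array and the "hello" argument are hypothetical.

```javascript
const events = require("events");

// Equivalent of the constructor-and-prototype pattern, written as a class.
class MyObj extends events.EventEmitter {
  constructor() {
    super();                       // runs the EventEmitter constructor
  }
}

const received = [];
const myObj = new MyObj();
myObj.on("someEvent", function (detail) {
  received.push(detail);
});
myObj.emit("someEvent", "hello");  // arguments after the name go to listeners
console.log(received);
```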
Adding Event Listeners to Objects
Once you have an instance of an object that can emit events, you can add listeners
for the events that you care about. Listeners are added to an EventEmitter object
using one of the following functions:
.addListener(eventName, callback): Attaches the callback
function to the object’s listeners. Every time the eventName event is
triggered, the callback function is called.
.on(eventName, callback): Same as .addListener().
.once(eventName, callback): The callback function is called only the
first time the eventName event is triggered.
For example, to add a listener to an instance of the MyObj EventEmitter
class defined in the previous section you would use the following:
function myCallback(){
…
}
var myObject = new MyObj();
myObject.on("someEvent", myCallback);
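The difference between .on() and .once() shows up when the same event is emitted more than once. A minimal sketch, assuming two hypothetical counters:

```javascript
const events = require("events");
const emitter = new events.EventEmitter();

let everyTime = 0;
let firstTimeOnly = 0;

emitter.on("someEvent", function () { everyTime += 1; });
emitter.once("someEvent", function () { firstTimeOnly += 1; });

emitter.emit("someEvent");
emitter.emit("someEvent");
emitter.emit("someEvent");

console.log("on: %d, once: %d", everyTime, firstTimeOnly);
```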
Removing Listeners from Objects
Listeners are useful and vital parts of Node.js programming. However, they do cause
overhead, and you should use them only when necessary. Node.js provides several
helper functions on the EventEmitter object that allow you to manage the
listeners that are included. These include
.listeners(eventName): Returns an array of listener functions attached
to the eventName event.
.setMaxListeners(n): Triggers a warning if more than n listeners are
added to an EventEmitter object. The default is 10.
.removeListener(eventName, callback): Removes the
callback function from the eventName event of the EventEmitter
object.
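The helper functions listed above can be combined as in the following sketch. The calls counter and the limit of 5 are illustrative assumptions.

```javascript
const events = require("events");
const emitter = new events.EventEmitter();

let calls = 0;
function myCallback() { calls += 1; }

emitter.setMaxListeners(5);
emitter.on("someEvent", myCallback);
console.log(emitter.listeners("someEvent").length);   // 1

emitter.emit("someEvent");          // myCallback runs
emitter.removeListener("someEvent", myCallback);
emitter.emit("someEvent");          // no listeners remain, so nothing runs

console.log(emitter.listeners("someEvent").length);   // 0
console.log("calls =", calls);
```

Note that removeListener() needs a reference to the original function, which is why the listener is a named function rather than an anonymous one.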
Implementing Event Listeners and Event Emitters
Listing 4.4 demonstrates the process of implementing listeners and custom event
emitters in Node.js. The Account object is extended to inherit from the
EventEmitter class and provides two methods to deposit and withdraw that both
emit the balanceChanged event. Then in lines 15–31, three callback functions
are implemented that are attached to the Account object instance
balanceChanged event and display various forms of data.
Notice that the checkGoal(acc, goal) callback is implemented a bit
differently than the others. This was done to illustrate how to pass variables into an
event listener function when the event is triggered. The results of executing the code
are shown in Listing 4.4 Output.
Listing 4.4 emitter_listener.js: Creating a custom EventEmitter
object and implementing three listeners that are triggered when the
balanceChanged event is triggered
01 var events = require('events');
02 function Account() {
03   this.balance = 0;
04   events.EventEmitter.call(this);
05   this.deposit = function(amount){
06     this.balance += amount;
07     this.emit('balanceChanged');
08   };
09   this.withdraw = function(amount){
10     this.balance -= amount;
11     this.emit('balanceChanged');
12   };
13 }
14 Account.prototype.__proto__ = events.EventEmitter.prototype;
15 function displayBalance(){
16   console.log("Account balance: $%d", this.balance);
17 }
18 function checkOverdraw(){
19   if (this.balance < 0){
20     console.log("Account overdrawn!!!");
21   }
22 }
23 function checkGoal(acc, goal){
24   if (acc.balance > goal){
25     console.log("Goal Achieved!!!");
26   }
27 }
28 var account = new Account();
29 account.on("balanceChanged", displayBalance);
30 account.on("balanceChanged", checkOverdraw);
31 account.on("balanceChanged", function(){
32   checkGoal(this, 1000);
33 });
34 account.deposit(220);
35 account.deposit(320);
36 account.deposit(600);
37 account.withdraw(1200);
Listing 4.4 Output emitter_listener.js: The account statements output
by the listener callback functions
C:\books\node\ch04>node emitter_listener.js
Account balance: $220
Account balance: $540
Account balance: $1140
Goal Achieved!!!
Account balance: $-60
Account overdrawn!!!
Implementing Callbacks
As you have seen in previous sections, the Node.js event-driven model relies heavily
on callback functions. Callback functions can be a bit difficult to understand at first,
especially if you want to depart from implementing a basic anonymous function.
This section deals with three specific implementations of callbacks: passing
parameters to a callback function, handling callback function parameters inside a
loop, and nesting callbacks.
Passing Additional Parameters to Callbacks
Most callbacks have automatic parameters passed to them, such as an error or result
buffer. A common question when working with callbacks is how to pass additional
parameters to them from the calling function. You do this by implementing the
parameter in an anonymous function and then calling the actual callback with
parameters from the anonymous function.
Listing 4.5 illustrates implementing callback parameters. There are two sawCar
event handlers. Note that the sawCar event only emits the make parameter. Notice
that the emitter.emit() function also can accept additional parameters; in this
case, make is added as shown in line 5. The first event handler on line 16
implements the logCar(make) callback handler. To add a color for
logColorCar(), an anonymous function is used in the event handler defined in
lines 17–21. A randomly selected color is passed to the call
logColorCar(make, color). You can see the output in Listing 4.5 Output.
Listing 4.5 callback_parameter.js: Creating an anonymous function to
add additional parameters not emitted by the event
01 var events = require('events');
02 function CarShow() {
03   events.EventEmitter.call(this);
04   this.seeCar = function(make){
05     this.emit('sawCar', make);
06   };
07 }
08 CarShow.prototype.__proto__ = events.EventEmitter.prototype;
09 var show = new CarShow();
10 function logCar(make){
11   console.log("Saw a " + make);
12 }
13 function logColorCar(make, color){
14   console.log("Saw a %s %s", color, make);
15 }
16 show.on("sawCar", logCar);
17 show.on("sawCar", function(make){
18   var colors = ['red', 'blue', 'black'];
19   var color = colors[Math.floor(Math.random()*3)];
20   logColorCar(make, color);
21 });
22 show.seeCar("Ferrari");
23 show.seeCar("Porsche");
24 show.seeCar("Bugatti");
25 show.seeCar("Lamborghini");
26 show.seeCar("Aston Martin");
Listing 4.5 Output callback_parameter.js: The results of adding a
color parameter to the callback
C:\books\node\ch04>node callback_parameter.js
Saw a Ferrari
Saw a blue Ferrari
Saw a Porsche
Saw a black Porsche
Saw a Bugatti
Saw a red Bugatti
Saw a Lamborghini
Saw a black Lamborghini
Saw a Aston Martin
Saw a black Aston Martin
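When the extra value is known at the time the listener is attached, Function.prototype.bind() is another way to supply it. This sketch is an alternative to the anonymous wrapper, not the approach used in Listing 4.5. It assumes a plain EventEmitter, a fixed color, and a logColorCar() whose parameters are reordered so that the bound value comes first.

```javascript
const events = require("events");
const show = new events.EventEmitter();
const seen = [];

// The bound argument comes first, so color precedes make here.
function logColorCar(color, make) {
  seen.push(color + " " + make);
}

// bind() fixes color as the first argument; make arrives from emit().
show.on("sawCar", logColorCar.bind(null, "red"));

show.emit("sawCar", "Ferrari");
show.emit("sawCar", "Porsche");
console.log(seen);
```

The anonymous wrapper remains the better choice when the extra value must be computed each time the event fires, as with the random color in the listing.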
Implementing Closure in Callbacks
An interesting problem that asynchronous callbacks have is that of closure. Closure
is a JavaScript term that indicates that variables are bound to a function’s scope and
not the parent function’s scope. When you execute an asynchronous callback, the
parent function’s scope may have changed; for example, when iterating through a list
and altering values in each iteration.
If your callback needs access to variables in the parent function’s scope, then you
need to provide closure so that those values are available when the callback is pulled
off the event queue. A basic way of doing that is by encapsulating the asynchronous
call inside a function block and passing in the variables that are needed.
Listing 4.6 implements a wrapper function that provides closure to the logCar()
asynchronous function. Notice that the loop in lines 7–12 implements a basic
callback. However, Listing 4.6 Output shows that the car name is always the last
item read because the value of message changes each time through the loop.
The loop in lines 13–20 implements a wrapper function that is passed message as
the msg parameter and that msg value sticks with the callback. Thus the closure
shown in Listing 4.6 Output displays the correct message. To make the callback truly
asynchronous, the process.nextTick() method is used to schedule the
callback.
Listing 4.6 callback_closure.js: Creating a wrapper function to provide
closure for variables needed in the asynchronous callback
01 function logCar(logMsg, callback){
02   process.nextTick(function() {
03     callback(logMsg);
04   });
05 }
06 var cars = ["Ferrari", "Porsche", "Bugatti"];
07 for (var idx in cars){
08   var message = "Saw a " + cars[idx];
09   logCar(message, function(){
10     console.log("Normal Callback: " + message);
11   });
12 }
13 for (var idx in cars){
14   var message = "Saw a " + cars[idx];
15   (function(msg){
16     logCar(msg, function(){
17       console.log("Closure Callback: " + msg);
18     });
19   })(message);
20 }
Listing 4.6 Output callback_closure.js: Adding a closure wrapper
function allows the asynchronous callback to access necessary variables
C:\books\node\ch04>node callback_closure.js
Normal Callback: Saw a Bugatti
Normal Callback: Saw a Bugatti
Normal Callback: Saw a Bugatti
Closure Callback: Saw a Ferrari
Closure Callback: Saw a Porsche
Closure Callback: Saw a Bugatti
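On Node.js versions that support block-scoped declarations, const or let inside the loop body gives each iteration its own binding, so the wrapper function is not needed. The following is a sketch of that alternative; the logged array and the "Block-scoped Callback" label are hypothetical.

```javascript
const cars = ["Ferrari", "Porsche", "Bugatti"];
const logged = [];

function logCar(logMsg, callback) {
  process.nextTick(function () {
    callback(logMsg);
  });
}

for (const car of cars) {
  // message is a fresh binding on every iteration, so each callback
  // keeps its own value without a wrapper function.
  const message = "Saw a " + car;
  logCar(message, function () {
    logged.push("Block-scoped Callback: " + message);
    console.log("Block-scoped Callback: " + message);
  });
}
```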
Chaining Callbacks
With asynchronous functions you are not guaranteed the order that they will run if
two are placed on the event queue. The best way to resolve that is to implement
callback chaining by having the callback from the asynchronous function call the
function again until there is no more work to do. That way the asynchronous
function is never on the event queue more than once.
Listing 4.7 implements a basic example of callback chaining. A list of items is
passed into the function logCars(), the asynchronous function logCar() is
called, and then the logCars() function is used as the callback when logCar()
completes. Thus only one version of logCar() is on the event queue at the same
time. The output of iterating through the list is shown in Listing 4.7 Output.
Listing 4.7 callback_chain.js: Implementing a callback chain where the
callback from an anonymous function calls back into the initial function to
iterate through a list
function logCar(car, callback){
  console.log("Saw a %s", car);
  if(cars.length){
    process.nextTick(function(){
      callback();
    });
  }
}
function logCars(cars){
  var car = cars.pop();
  logCar(car, function(){
    logCars(cars);
  });
}
var cars = ["Ferrari", "Porsche", "Bugatti",
            "Lamborghini", "Aston Martin"];
logCars(cars);
Listing 4.7 Output callback_chain.js: Using an asynchronous callback
chain to iterate through a list
C:\books\node\ch04>node callback_chain.js
Saw a Aston Martin
Saw a Lamborghini
Saw a Bugatti
Saw a Porsche
Saw a Ferrari
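Because pop() consumes the array from the end, the list is emptied and the items are visited in reverse. A variant that walks the list by index keeps the array intact and preserves its order. This is a sketch of the same chaining idea; the index and done parameters and the visited array are hypothetical additions.

```javascript
const visited = [];

function logCar(car, callback) {
  visited.push(car);
  console.log("Saw a %s", car);
  process.nextTick(callback);      // schedule the next link in the chain
}

// Hypothetical variant: walks the list by index instead of calling pop().
function logCars(cars, index, done) {
  if (index >= cars.length) {
    done();
    return;
  }
  logCar(cars[index], function () {
    logCars(cars, index + 1, done);
  });
}

const cars = ["Ferrari", "Porsche", "Bugatti"];
logCars(cars, 0, function () {
  console.log("All %d cars logged", visited.length);
});
```

As in Listing 4.7, only one logCar() callback is on the event queue at any time.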
Summary
The event-driven model that Node.js uses provides scalability and performance. You
learned the difference between the event-driven model and the traditional threaded
model for webservers. You learned that events can be added to the event queue when
blocking I/O is called. And you learned that listeners can be triggered by events or
timers or called directly using the nextTick() method.
This chapter discussed the three types of timer events: timeout, interval, and
immediate. Each of these can be used to delay the execution of work for a period of
time. You also saw how to implement your own custom event emitters and add
listener functions to them.
Next
In the next chapter you see how to manage data I/O using streams and buffers. You
also learn about Node.js functionality that allows you to manipulate JSON, string,
and compressed forms of data.
5
Handling Data I/O in Node.js
Most active web applications and services have a lot of data flowing through them.
That data comes in the form of text, JSON strings, binary buffers, and data streams.
For that reason, Node.js has many mechanisms built in to support handling the data
I/O from system to system. It is important to understand the mechanisms that
Node.js provides to implement effective and efficient web applications and services.
This chapter focuses on manipulating JSON data, managing binary data buffers,
implementing readable and writable streams, and compressing and decompressing
data. You learn how to leverage the Node.js functionality to work with different I/O
requirements.
Working with JSON
One of the most common data types that you work with when implementing Node.js
web applications and services is JSON (JavaScript Object Notation). JSON is a
lightweight method to convert JavaScript objects into a string form and then back
again. This makes it easy to serialize data objects when
passing them from client to server, process to process, or stream to stream, or when
storing them in a database.
There are several reasons to use JSON to serialize your JavaScript objects over XML
including the following:
JSON is much more efficient and takes up fewer characters.
Serializing/deserializing JSON is faster than XML because of its simpler syntax.
JSON is easier to read from a developer’s perspective because it is similar to
JavaScript syntax.
The only reasons you might want to use XML over JSON are for complex objects or
if you have XML/XSLT transforms already in place.
Converting JSON to JavaScript Objects
A JSON string represents the JavaScript object in string form. The string syntax is
similar to code, making it easy to understand. You can use the
JSON.parse(string) method to convert a string that is properly formatted with
JSON into a JavaScript object.
For example, the following code snippet defines accountStr as a formatted JSON
string and converts it to a JavaScript object using JSON.parse(). Then member
properties can be accessed via dot notation:
var accountStr = '{"name":"Jedi", "members":["Yoda","Obi Wan"], \
"number":34512, "location": "A galaxy far, far away"}';
var accountObj = JSON.parse(accountStr);
console.log(accountObj.name);
console.log(accountObj.members);
The preceding code outputs the following:
Jedi
[ 'Yoda', 'Obi Wan' ]
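JSON.parse() throws a SyntaxError when the string is not properly formatted JSON, so text that comes from outside your application is usually parsed inside a try/catch block. The following is a minimal sketch of that pattern; the parseAccount() helper is a hypothetical name used only for illustration and is not part of Node.js:

```javascript
// parseAccount() is a hypothetical helper used only for illustration.
// It returns the parsed object, or null when the text is not valid JSON.
function parseAccount(text) {
  try {
    return JSON.parse(text);
  } catch (err) {
    // JSON.parse() reports malformed input by throwing a SyntaxError.
    console.log("Invalid JSON: " + err.name);
    return null;
  }
}

var good = parseAccount('{"name":"Jedi", "number":34512}');
console.log(good.name);

// Single quotes around property names and values are not valid JSON.
var bad = parseAccount("{'name':'Jedi'}");
console.log(bad);
```

Returning null is only one possible design choice; depending on the application you might rethrow the error or pass it to a callback instead.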
Converting JavaScript Objects to JSON
Node also allows you to convert a JavaScript object into a properly formatted JSON
string. The string form can then be stored in a file or database, sent across an HTTP
connection, or written to a stream/buffer. Use the JSON.stringify(object)
method to convert a JavaScript object into a JSON string.
For example, the following code defines a JavaScript object that includes string,
numeric, and array properties. Using JSON.stringify(), it is all converted to a
JSON string:
var accountObj = {
name: "Baggins",
number: 10645,
members: ["Frodo, Bilbo"],
location: "Shire"
};
var accountStr = JSON.stringify(accountObj);
console.log(accountStr);
The preceding code outputs the following:
{"name":"Baggins","number":10645,"members":["Frodo, Bilbo"],"location":"Shire"}
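JSON.stringify() also accepts two optional arguments, a replacer and an indentation setting. The sketch below assumes you want human-readable output and passes null for the replacer and 2 for the indentation; it then parses the string again to show that the round trip preserves the data:

```javascript
var accountObj = {
  name: "Baggins",
  number: 10645,
  location: "Shire"
};

// The third argument sets the number of spaces used for indentation.
var pretty = JSON.stringify(accountObj, null, 2);
console.log(pretty);

// Parsing the serialized string produces an equivalent object.
var roundTrip = JSON.parse(JSON.stringify(accountObj));
console.log(roundTrip.number === accountObj.number);
```

The indented form is convenient for log files and configuration files, while the compact form is the usual choice for data sent over a network.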
Using the Buffer Module to Buffer Data
While JavaScript is Unicode friendly, it is not good at managing binary data.
However, binary data is useful when implementing some web applications and
services. For example:
Transferring compressed files
Generating dynamic images
Sending serialized binary data
Understanding Buffered Data
Buffered data is made up of a series of octets in big endian or little endian format.
That means they take up considerably less space than textual data. Therefore,
Node.js provides the Buffer module that gives you the functionality to create,
read, write, and manipulate binary data in a buffer structure. The Buffer module is
global, so you do not need to use the require() statement to access it.
Buffered data is stored in a structure similar to that of an array but is stored outside
the normal V8 heap in raw memory allocations. Therefore a Buffer cannot be
resized.
When converting buffers to and from strings, you need to specify the explicit
encoding method to be used. Table 5.1 lists the various encoding methods supported.
Table 5.1 Methods of encoding between strings and binary buffers
Method
    Description
utf8
    Multi-byte encoded Unicode characters used as the standard in most
    documents and webpages.
utf16le
    Little endian encoded Unicode characters of 2 or 4 bytes.
ucs2
    Same as utf16le.
base64
    Base64 string encoding.
hex
    Encode each byte as two hexadecimal characters.
Big Endian and Little Endian
Binary data in buffers is stored as a series of octets, each a sequence of eight 0s and
1s that can be a hexadecimal value of 0x00 to 0xFF. It can be read as a single byte
or as a word containing multiple bytes. Endianness defines the ordering of the bytes
within a word. Big endian stores the most significant byte first, and
little endian stores the least significant byte first. For example, the word 0x0A
0x0B 0x0C 0x0D would be stored in the buffer as [0x0A, 0x0B, 0x0C,
0x0D] in big endian but as [0x0D, 0x0C, 0x0B, 0x0A] in little endian.
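The byte ordering can be observed directly by writing the same 32-bit value with both byte orders. This is a small sketch that assumes a Node.js release providing Buffer.alloc(), which newer releases use for allocating a zero-filled buffer:

```javascript
// Write the same 32-bit value using both byte orders.
var big = Buffer.alloc(4);
var little = Buffer.alloc(4);
big.writeUInt32BE(0x0A0B0C0D, 0);
little.writeUInt32LE(0x0A0B0C0D, 0);

console.log(big.toString('hex'));    // most significant byte first
console.log(little.toString('hex')); // least significant byte first

// Reading with the matching byte order recovers the original value.
console.log(little.readUInt32LE(0) === 0x0A0B0C0D);
```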
Creating Buffers
Buffer objects are actually raw memory allocations; therefore, their size must be
determined when they are created. The three methods for creating Buffer objects
using the new keyword are
new Buffer(sizeInBytes)
new Buffer(octetArray)
new Buffer(string, [encoding])
For example, the following lines of code define buffers using a byte size, an octet
array, and a UTF8 string:
var buf256 = new Buffer(256);
var bufOctets = new Buffer([0x6f, 0x63, 0x74, 0x65, 0x74, 0x73]);
var bufUTF8 = new Buffer("Some UTF8 Text \u00b6 \u30c6 \u20ac", 'utf8');
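Newer Node.js releases deprecate the new Buffer() constructor in favor of Buffer.alloc() and Buffer.from(). As a sketch, assuming one of those releases, the same three buffers can be created as follows:

```javascript
// Counterpart of new Buffer(256); the memory is zero-filled.
var buf256 = Buffer.alloc(256);
// Counterpart of new Buffer(octetArray).
var bufOctets = Buffer.from([0x6f, 0x63, 0x74, 0x65, 0x74, 0x73]);
// Counterpart of new Buffer(string, encoding).
var bufUTF8 = Buffer.from("Some UTF8 Text \u00b6 \u30c6 \u20ac", 'utf8');

console.log(buf256.length);
console.log(bufOctets.toString());
console.log(bufUTF8.length);
```

Note that the length of bufUTF8 is larger than the number of characters in the string because the last three characters each take more than one byte in UTF8.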
Writing to Buffers
You cannot extend the size of a Buffer object after it has been created, but you can
write data to any location in the buffer. Table 5.2 describes the methods you
can use when writing to buffers.
Table 5.2 Methods of writing to Buffer objects
Method
    Description
buffer.write(string, [offset], [length], [encoding])
    Writes length number of bytes from the string starting at the offset index
    inside the buffer using encoding.
buffer[offset] = value
    Replaces the data at index offset with the value specified.
buffer.fill(value, [offset], [end])
    Writes the value to every byte in the buffer starting at the offset index and
    ending with the end index.
writeInt8(value, offset, [noAssert])
writeInt16LE(value, offset, [noAssert])
writeInt16BE(value, offset, [noAssert])
…
    There is a wide range of methods for Buffer objects to write integers,
    unsigned integers, doubles, and floats of various sizes using little endian or
    big endian. value specifies the value to write, offset specifies the index to
    write to, and noAssert specifies whether to skip validation of the value and
    offset. noAssert should be left at the default false unless you are
    absolutely certain of correctness.
To illustrate writing to buffers better, Listing 5.1 defines a buffer, fills it with zeros,
writes some text at the beginning using the write() method at line 3, and then
adds some additional text using a write that alters part of the existing buffer using
write(string, offset, length) at line 5. Then in line 7 it adds a + to the
end by directly setting the value of an index, as shown in Listing 5.1 Output. Notice
that the buf256.write("more text", 9, 9) statement writes to the middle
of the buffer and buf256[18] = 43 changes a single byte.
Listing 5.1 buffer_write.js: Various ways to write to a Buffer object
1 buf256 = new Buffer(256);
2 buf256.fill(0);
3 buf256.write("add some text");
4 console.log(buf256.toString());
5 buf256.write("more text", 9, 9);
6 console.log(buf256.toString());
7 buf256[18] = 43;
8 console.log(buf256.toString());
Listing 5.1 Output buffer_write.js: Writing data to a Buffer object
C:\books\node\ch05>node buffer_write.js
add some text
add some more text
add some more text+
Reading from Buffers
There are several methods for reading from buffers. The simplest is to use the
toString() method to convert all or part of a buffer to a string. However, you
can also access specific indexes in the buffer directly or by using read(). Also
Node.js provides a StringDecoder object that has a write(buffer) method
that decodes and writes buffered data using the specified encoding. Table 5.3
describes these methods for reading Buffer objects.
Table 5.3 Methods of reading from Buffer objects
Method
    Description
buffer.toString([encoding], [start], [end])
    Returns a string containing the decoded characters specified by encoding
    from the start index to the end index of the buffer. If start or end is not
    specified, then toString() uses the beginning or end of the buffer.
stringDecoder.write(buffer)
    Returns a decoded string version of the buffer.
buffer[offset]
    Returns the octet value in the buffer at the specified offset.
readInt8(offset, [noAssert])
readInt16LE(offset, [noAssert])
readInt16BE(offset, [noAssert])
…
    There is a wide range of methods for Buffer objects to read integers,
    unsigned integers, doubles, and floats of various sizes using little endian or
    big endian. These functions accept the offset to read from and an optional
    noAssert Boolean value that specifies whether to skip validation of the
    offset. noAssert should be left at the default false unless you are
    absolutely certain of correctness.
To illustrate reading from buffers, Listing 5.2 defines a buffer with UTF8 encoded
characters, and then uses toString() without parameters to read all the buffer,
and then with the encoding, start, and end parameters to read part of the
buffer. Then in lines 4 and 5 it creates a StringDecoder with UTF8 encoding
and uses it to write the contents of the buffer out to the console. Next, a direct access
method is used to get the value of the octet at index 18, and readUInt32BE() reads
the four octets starting at that index. Listing 5.2 Output shows the
output of the code.
Listing 5.2 buffer_read.js: Various ways to read from a Buffer object
1 bufUTF8 = new Buffer("Some UTF8 Text \u00b6 \u30c6 \u20ac", 'utf8');
2 console.log(bufUTF8.toString());
3 console.log(bufUTF8.toString('utf8', 5, 9));
4 var StringDecoder = require('string_decoder').StringDecoder;
5 var decoder = new StringDecoder('utf8');
6 console.log(decoder.write(bufUTF8));
7 console.log(bufUTF8[18].toString(16));
8 console.log(bufUTF8.readUInt32BE(18).toString(16));
Listing 5.2 Output buffer_read.js: Reading data from a Buffer object
C:\books\node\ch05>node buffer_read.js
Some UTF8 Text ¶ テ €
UTF8
Some UTF8 Text ¶ テ €
e3
e3838620
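The integer read methods listed in Table 5.3 interpret the same octets differently depending on the signedness and byte order you ask for. The following sketch uses a small buffer whose values were chosen only for illustration, and it assumes a Node.js release that provides Buffer.from():

```javascript
var buf = Buffer.from([0x01, 0x02, 0xff, 0x7f]);

var rawOctet = buf[2];              // raw octet value
var signedOctet = buf.readInt8(2);  // same octet read as a signed integer
var wordBE = buf.readUInt16BE(0);   // octets 0x01 0x02 read as big endian
var wordLE = buf.readUInt16LE(0);   // octets 0x01 0x02 read as little endian

console.log(rawOctet, signedOctet, wordBE, wordLE);
```

Here rawOctet is 255 and signedOctet is -1, while the two 16-bit reads give 258 (0x0102) and 513 (0x0201).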
Determining Buffer Length
A common task when dealing with buffers is determining the length, especially
when you create a buffer dynamically from a string. The length of a buffer can be
determined by calling .length on the Buffer object. To determine the byte
length that a string takes up in a buffer you cannot use the .length property.
Instead you need to use Buffer.byteLength(string, [encoding]).
Note that there is a difference between the string length and byte length of a buffer.
To illustrate this consider the following statements:
"UTF8 text \u00b6".length;                      //evaluates to 11
Buffer.byteLength("UTF8 text \u00b6", 'utf8');  //evaluates to 12
Buffer("UTF8 text \u00b6").length;              //evaluates to 12
Notice that the same string evaluates to 11 characters, but because it contains a
double-byte character the byteLength is 12. Also note that Buffer("UTF8
text \u00b6").length evaluates to 12 also. That is because .length on a
buffer returns the byte length.
Copying Buffers
An important part of working with buffers is the ability to copy data from one buffer
into another buffer. Node.js provides the copy(targetBuffer,
[targetStart], [sourceStart], [sourceEnd]) method on
Buffer objects. The targetBuffer parameter is another Buffer object, and
targetStart, sourceStart, and sourceEnd are indexes inside the source
and target buffers.
Note
To copy string data from one buffer to the next, make sure that both buffers use
the same encoding or you may get unexpected results when decoding the resulting
buffer.
You can also copy data from one buffer to the other by indexing them directly, for
example:
destinationBuffer[index] = sourceBuffer[index]
Listing 5.3 illustrates three examples of copying data from one buffer to another. The
first method in lines 4–8 copies the full buffer. The next method in lines 10–14
copies only the middle 5 bytes of a buffer. The third example iterates through the
source buffer and only copies every other byte in the buffer. The results are shown in
Listing 5.3 Output.
Listing 5.3 buffer_copy.js: Various ways to copy data from one Buffer
object to another
01 var alphabet = new Buffer('abcdefghijklmnopqrstuvwxyz');
02 console.log(alphabet.toString());
03 // copy full buffer
04 var blank = new Buffer(26);
05 blank.fill();
06 console.log("Blank: " + blank.toString());
07 alphabet.copy(blank);
08 console.log("Blank: " + blank.toString());
09 // copy part of buffer
10 var dashes = new Buffer(26);
11 dashes.fill('-');
12 console.log("Dashes: " + dashes.toString());
13 alphabet.copy(dashes, 10, 10, 15);
14 console.log("Dashes: " + dashes.toString());
15 // copy to and from direct indexes of buffers
16 var dots = new Buffer('-------------------------');
17 dots.fill('.');
18 console.log("dots: " + dots.toString());
19 for (var i=0; i < dots.length; i++){
20   if (i % 2) { dots[i] = alphabet[i]; }
21 }
22 console.log("dots: " + dots.toString());
Listing 5.3 Output buffer_copy.js: Copying data from one Buffer
object to another
C:\books\node\ch05>node buffer_copy.js
abcdefghijklmnopqrstuvwxyz
Blank:
Blank: abcdefghijklmnopqrstuvwxyz
Dashes: --------------------------
Dashes: ----------klmno-----------
dots: .........................
dots: .b.d.f.h.j.l.n.p.r.t.v.x.
Slicing Buffers
Another important aspect of working with buffers is the ability to divide them into
slices. A slice is a section of a buffer between a starting index and an ending index.
Slicing a buffer allows you to manipulate a specific chunk.
Slices are created using the slice([start], [end]) method, which returns a
Buffer object that points to the start index of the original buffer and has a length of
end – start. Keep in mind that a slice is different from a copy. If you edit a copy,
the original does not change. However, if you edit a slice, the original does change.
Listing 5.4 illustrates using slices. Note that when the slice is altered in lines 5 and 6,
it also alters the original buffer, as shown in Listing 5.4 Output.
Listing 5.4 buffer_slice.js: Creating and manipulating slices of a Buffer
object
1 var numbers = new Buffer("123456789");
2 console.log(numbers.toString());
3 var slice = numbers.slice(3, 6);
4 console.log(slice.toString());
5 slice[0] = '#'.charCodeAt(0);
6 slice[slice.length-1] = '#'.charCodeAt(0);
7 console.log(slice.toString());
8 console.log(numbers.toString());
Listing 5.4 Output buffer_slice.js: Slicing and modifying a Buffer
object
C:\books\node\ch05>node buffer_slice.js
123456789
456
#5#
123#5#789
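The difference between a slice and a copy can be seen side by side. This sketch assumes a newer Node.js release and uses subarray(), which those releases provide as the preferred name for the same memory-sharing operation, together with Buffer.from(buffer), which copies the bytes into new memory:

```javascript
var numbers = Buffer.from("123456789");

// subarray() returns a view that shares memory with the original buffer.
var view = numbers.subarray(3, 6);
// Buffer.from(buffer) allocates new memory and copies the bytes into it.
var copy = Buffer.from(view);

view[0] = '#'.charCodeAt(0);  // also changes numbers
copy[1] = '*'.charCodeAt(0);  // changes only the copy

console.log(numbers.toString());
console.log(copy.toString());
```

The original buffer now contains 123#56789 because the view wrote through to it, while the copy contains 4*6 and left the original untouched.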
Concatenating Buffers
You can also concatenate two or more Buffer objects together to form a new
buffer. The concat(list, [totalLength]) method accepts an array of
Buffer objects as the first parameter, and totalLength defines the maximum
bytes in the buffer as an optional second argument. The Buffer objects are
concatenated in the order they appear in the list, and a new Buffer object is
returned containing the contents of the original buffers up to totalLength bytes.
If you do not provide a totalLength parameter, concat() figures it out for
you. However, it has to iterate through the list, so providing a totalLength value
is faster.
Listing 5.5 concatenates a base Buffer with one buffer and then another, as shown
in Listing 5.5 Output.
Listing 5.5 buffer_concat.js: Concatenating Buffer objects
var af = new Buffer("African Swallow?");
var eu = new Buffer("European Swallow?");
var question = new Buffer("Air Speed Velocity of an ");
console.log(Buffer.concat([question, af]).toString());
console.log(Buffer.concat([question, eu]).toString());
Listing 5.5 Output buffer_concat.js: Concatenating Buffer objects
C:\books\node\ch05>node buffer_concat.js
Air Speed Velocity of an African Swallow?
Air Speed Velocity of an European Swallow?
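The effect of the totalLength parameter can be sketched as follows. The example assumes a Node.js release that provides Buffer.from(), and the three short buffers are made up for illustration:

```javascript
var parts = [Buffer.from("abc"), Buffer.from("def"), Buffer.from("ghi")];

// Compute the total once and pass it in so concat() does not have to.
var total = 0;
for (var i = 0; i < parts.length; i++) {
  total += parts[i].length;
}
var full = Buffer.concat(parts, total);
console.log(full.toString());

// A totalLength smaller than the combined size truncates the result.
var truncated = Buffer.concat(parts, 4);
console.log(truncated.toString());
```

The first call returns all nine bytes, and the second returns only the first four.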
Using the Stream Module to Stream Data
An important module in Node.js is the stream module. Data streams are memory
structures that are readable, writable, or both. Streams are used all over in Node.js,
for example, when accessing files or reading data from HTTP requests and in several
other areas. This section covers using the stream module to create streams as well
as read data from and write data to them.
The purpose of streams is to provide a common mechanism to transfer data from one
location to another. They also expose events, such as when data is available to be
read, when an error occurs, and so on. You can then register listeners to handle the
data when it becomes available in a stream or is ready to be written to.
Some common uses for streams are HTTP data and files. You can open a file as a
readable stream or access the data from an HTTP request as a readable stream and
read bytes out as needed. Additionally, you can create your own custom streams. The
following sections describe the process of creating and using readable, writable,
duplex, and transform streams.
Readable Streams
Readable streams provide a mechanism to easily read data coming into your
application from another source. Some common examples of readable streams are
HTTP responses on the client
HTTP requests on the server
fs read streams
zlib streams
crypto streams
TCP sockets
Child processes stdout and stderr
process.stdin
Readable streams provide the read([size]) method to read data where size
specifies the number of bytes to read from the stream. read() can return a
String, Buffer or null. Readable streams also expose the following events:
readable: Emitted when a chunk of data can be read from the stream.
data: Similar to readable except that when data event handlers are
attached, the stream is turned into flowing mode, and the data handler is
called continuously until all data has been drained.
end: Emitted by the stream when data will no longer be provided.
close: Emitted when the underlying resource, such as a file, has been closed.
error: Emitted when an error occurs receiving data.
Readable stream objects also provide a number of functions that allow you to read
and manipulate them. Table 5.4 lists the methods available on a Readable stream
object.
Table 5.4 Methods available on Readable stream objects
Method
    Description
read([size])
    Reads data from the stream. The data can be a String, Buffer, or null,
    meaning there is no more data left. If a size argument is specified, then
    the data is limited to that number of bytes.
setEncoding(encoding)
    Sets the encoding to use when returning String in the read() request.
pause()
    This pauses data events from being emitted by the object.
resume()
    This resumes data events being emitted by the object.
pipe(destination, [options])
    This pipes the output of this stream into a Writable stream object
    specified by destination. options is a JavaScript object. For example,
    {end:true} ends the Writable destination when the Readable ends.
unpipe([destination])
    Disconnects this object from the Writable destination.
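As an illustration of the readable event together with read() and setEncoding(), the following sketch creates a Readable by passing a read function in the constructor options, which newer Node.js releases allow, and pushes data into it by hand. The variable names are made up for this example:

```javascript
var stream = require('stream');

// A Readable with a no-op read option; data is pushed in manually below.
var source = new stream.Readable({ read: function() {} });
source.setEncoding('utf8');  // read() now returns strings instead of Buffers

var chunks = [];
source.on('readable', function() {
  var chunk;
  // Keep calling read() until it returns null (nothing left to read for now).
  while ((chunk = source.read()) !== null) {
    chunks.push(chunk);
  }
});
source.on('end', function() {
  console.log("Received: " + chunks.join(""));
});

source.push("yes ");
source.push("no ");
source.push(null);  // signals that no more data will be provided
```

Unlike a data handler, a readable handler leaves the stream paused, so your code decides when to call read().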
To implement your own custom Readable stream object, you need to first inherit
the functionality for Readable streams. The simplest way to do that is to use the
util module’s inherits() method:
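var stream = require('stream');
var util = require('util');
util.inherits(MyReadableStream, stream.Readable);
Then, inside the constructor function of your object, call the Readable constructor so that each instance is initialized as a Readable stream:
stream.Readable.call(this, opt);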
You also need to implement a _read() method that calls push() to output the
data from the Readable object. The push() call should push either a String,
Buffer, or null.
Listing 5.6 illustrates the basics of implementing and reading from a Readable
stream. Notice that the Answers() class inherits from Readable and then
implements the Answers.prototype._read() function to handle pushing data
out. Also notice that on line 18, a direct read() call reads the first item from the
stream and then the rest of the items are read by the data event handler defined on
lines 19–21. Listing 5.6 Output shows the result.
Listing 5.6 stream_read.js: Implementing a Readable stream object
01 var stream = require('stream');
02 var util = require('util');
03 util.inherits(Answers, stream.Readable);
04 function Answers(opt) {
05   stream.Readable.call(this, opt);
06   this.quotes = ["yes", "no", "maybe"];
07   this._index = 0;
08 }
09 Answers.prototype._read = function() {
10   if (this._index >= this.quotes.length){
11     this.push(null);
12   } else {
13     this.push(this.quotes[this._index]);
14     this._index += 1;
15   }
16 };
17 var r = new Answers();
18 console.log("Direct read: " + r.read().toString());
19 r.on('data', function(data){
20   console.log("Callback read: " + data.toString());
21 });
22 r.on('end', function(data){
23   console.log("No more answers.");
24 });
Listing 5.6 Output stream_read.js: Implementing a custom Readable
object
C:\books\node\ch05>node stream_read.js
Direct read: yes
Callback read: no
Callback read: maybe
No more answers.
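On Node.js releases that support ES2015 classes, a custom Readable can also be declared with class and extends instead of util.inherits(). The following is a sketch of the same Answers stream in that style; it collects the chunks in a hypothetical received array rather than reading the first item directly:

```javascript
var stream = require('stream');

// The Answers stream written as an ES2015 class.
class Answers extends stream.Readable {
  constructor(opt) {
    super(opt);  // takes the place of stream.Readable.call(this, opt)
    this.quotes = ["yes", "no", "maybe"];
    this._index = 0;
  }
  _read() {
    if (this._index >= this.quotes.length) {
      this.push(null);  // no more data
    } else {
      this.push(this.quotes[this._index]);
      this._index += 1;
    }
  }
}

var received = [];
var r = new Answers();
r.on('data', function(data) {
  received.push(data.toString());
});
r.on('end', function() {
  console.log(received.join(", "));
});
```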
Writable Streams
Writable streams are designed to provide a mechanism to write data into a form
that can easily be consumed in another area of code. Some common examples of
Writable streams are
HTTP requests on the client
HTTP responses on the server
fs write streams
zlib streams
crypto streams
TCP sockets
Child process stdin
process.stdout, process.stderr
Writable streams provide the write(chunk, [encoding],
[callback]) method to write data into the stream, where chunk contains the
data to write, encoding specifies the string encoding if necessary, and callback
specifies a callback function to execute when the data has been fully flushed. The
write() function returns true if more data can be written right away, or false if
the internal buffer is full and you should wait for the drain event. Writable
streams also expose the following events:
drain: After a write() call returns false, the drain event is emitted to
notify listeners when it is okay to begin writing more data.
finish: Emitted when end() is called on the Writable object; all data is
flushed and no more data will be accepted.
pipe: Emitted when the pipe() method is called on a Readable stream to
add this Writable as a destination.
unpipe: Emitted when the unpipe() method is called on a Readable
stream to remove this Writable as a destination.
Writable stream objects also provide a number of methods that allow you to write
and manipulate them. Table 5.5 lists the methods available on a Writable stream
object.
Table 5.5 Methods available on Writable stream objects
Method
    Description
write(chunk, [encoding], [callback])
    Writes the data chunk to the stream object’s data location. The data can be
    a String or Buffer. If encoding is specified, then it is used to encode
    string data. If callback is specified, then it is called after the data has
    been flushed.
end([chunk], [encoding], [callback])
    Same as write(), except it puts the Writable into a state where it no
    longer accepts data and sends the finish event.
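The return value of write() and the drain event work together to keep a fast producer from overwhelming a slow destination. The sketch below assumes a newer Node.js release that accepts a write function in the constructor options; the deliberately tiny highWaterMark, the timer delay, and the writeNext() helper are chosen only for demonstration:

```javascript
var stream = require('stream');

// A slow Writable with a tiny buffer so that write() returns false quickly.
var written = [];
var slow = new stream.Writable({
  highWaterMark: 4,
  write: function(chunk, encoding, callback) {
    written.push(chunk.toString());
    setTimeout(callback, 10);  // simulate a slow destination
  }
});

var items = ["Item1", "Item2", "Item3"];
function writeNext() {
  while (items.length) {
    var ok = slow.write(items.shift());
    if (!ok) {
      // The internal buffer is full: wait for drain before writing more.
      slow.once('drain', writeNext);
      return;
    }
  }
  slow.end();
}
slow.on('finish', function() {
  console.log(written.join(","));
});
writeNext();
```

Ignoring a false return value does not lose data, but the unwritten chunks accumulate in memory until the destination catches up.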
To implement your own custom Writable stream object, you need to first inherit
the functionality for Writable streams. The simplest way to do that is to use the
util module’s inherits() method:
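var stream = require('stream');
var util = require('util');
util.inherits(MyWritableStream, stream.Writable);
Then, inside the constructor function of your object, call the Writable constructor so that each instance is initialized as a Writable stream:
stream.Writable.call(this, opt);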
You also need to implement a _write(data, encoding, callback)
method that stores the data for the Writable object. Listing 5.7 illustrates the
basics of implementing and writing to a Writable stream. Listing 5.7 Output
shows the result.
Listing 5.7 stream_write.js: Implementing a Writable stream object
var stream = require('stream');
var util = require('util');
util.inherits(Writer, stream.Writable);
function Writer(opt) {
  stream.Writable.call(this, opt);
  this.data = new Array();
}
Writer.prototype._write = function(data, encoding, callback) {
  this.data.push(data.toString('utf8'));
  console.log("Adding: " + data);
  callback();
};
var w = new Writer();
for (var i=1; i<=5; i++){
  w.write("Item" + i, 'utf8');
}
w.end("ItemLast");
console.log(w.data);
Listing 5.7 Output stream_write.js: Implementing a custom Writable
object
C:\books\node\ch05>node stream_write.js
Adding: Item1
Adding: Item2
Adding: Item3
Adding: Item4
Adding: Item5
Adding: ItemLast
[ 'Item1', 'Item2', 'Item3', 'Item4', 'Item5', 'ItemLast' ]
Duplex Streams
A Duplex stream combines Readable and Writable functionality. A good
example of a duplex stream is a TCP socket connection. You can read and write
from the socket connection once it has been created.
To implement your own custom Duplex stream object, you need to first inherit the
functionality for Duplex streams. The simplest way to do that is to use the util
module’s inherits() method:
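var stream = require('stream');
var util = require('util');
util.inherits(MyDuplexStream, stream.Duplex);
Then, inside the constructor function of your object, call the Duplex constructor so that each instance is initialized as a Duplex stream:
stream.Duplex.call(this, opt);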
The opt parameter when creating a Duplex stream accepts an object with the
property allowHalfOpen set to true or false. If this option is true, then the
writable side stays open even after the readable side has ended. If this
option is set to false, the writable side is ended automatically when the readable
side ends.
When you implement a Duplex stream, you need to implement both a
_read(size) and a _write(data, encoding, callback) method when
prototyping your Duplex class.
Listing 5.8 illustrates the basics of implementing writing to and reading from a
Duplex stream. The example is basic but shows the main concepts. The
Duplexer() class inherits from the Duplex stream and implements a
rudimentary _write() function that stores data in an array in the object. The
_read() function uses shift() to get the first item in the array and then ends by
pushing null if it is equal to "stop", pushes it if there is a value, or sets a timeout
timer to call back to the _read() function if there is no value.
In Listing 5.8 Output, notice that the chunks are read back in the same order in
which they were written, because _read() shifts items off the front of the array
that _write() appends to.
Listing 5.8 stream_duplex.js: Implementing a Duplex stream object
var stream = require('stream');
var util = require('util');
util.inherits(Duplexer, stream.Duplex);
function Duplexer(opt) {
  stream.Duplex.call(this, opt);
  this.data = [];
}
Duplexer.prototype._read = function readItem(size) {
  var chunk = this.data.shift();
  if (chunk == "stop"){
    this.push(null);
  } else {
    if(chunk){
      this.push(chunk);
    } else {
      setTimeout(readItem.bind(this), 500, size);
    }
  }
};
Duplexer.prototype._write = function(data, encoding, callback) {
  this.data.push(data);
  callback();
};
var d = new Duplexer();
d.on('data', function(chunk){
  console.log('read: ', chunk.toString());
});
d.on('end', function(){
  console.log('Message Complete');
});
d.write("I think, ");
d.write("therefore ");
d.write("I am.");
d.write("Rene Descartes");
d.write("stop");
Listing 5.8 Output stream_duplex.js: Implementing a custom Duplex
object
C:\books\node\ch05>node stream_duplex.js
read: I think,
read: therefore
read: I am.
read: Rene Descartes
Message Complete
Transform Streams
Another type of stream is the Transform stream. A Transform stream extends
the Duplex stream but modifies the data between the Writable stream and the
Readable stream. This can be useful when you need to modify data from one
system to another. Some examples of Transform streams are
zlib streams
crypto streams
A major difference between the Duplex and the Transform streams is that for
Transforms you do not need to implement the _read() and _write()
prototype methods. These are provided as pass-through functions. Instead, you
implement the _transform(chunk, encoding, callback) and
_flush(callback) methods. The _transform() method should accept the
data from write() requests, modify it, and then push() out the modified data.
Listing 5.9 illustrates the basics of implementing a Transform stream. The stream
accepts JSON strings, converts them to objects, and then emits a custom event
named object that sends the object to any listeners. The _transform()
function also modifies the object to include a handled property and then sends a
string form on. Notice that lines 18–21 implement the object event handler
function that displays certain attributes. In Listing 5.9 Output, notice that the JSON
strings now include the handled property.
Listing 5.9 stream_transform.js: Implementing a Transform stream
object
01 var stream = require("stream");
02 var util = require("util");
03 util.inherits(JSONObjectStream, stream.Transform);
04 function JSONObjectStream (opt) {
05   stream.Transform.call(this, opt);
06 };
07 JSONObjectStream.prototype._transform = function (data, encoding, callback) {
08   object = data ? JSON.parse(data.toString()) : "";
09   this.emit("object", object);
10   object.handled = true;
11   this.push(JSON.stringify(object));
12   callback();
13 };
14 JSONObjectStream.prototype._flush = function(cb) {
15   cb();
16 };
17 var tc = new JSONObjectStream();
18 tc.on("object", function(object){
19   console.log("Name: %s", object.name);
20   console.log("Color: %s", object.color);
21 });
22 tc.on("data", function(data){
23   console.log("Data: %s", data.toString());
24 });
25 tc.write('{"name":"Carolinus", "color": "Green"}');
26 tc.write('{"name":"Solarius", "color": "Blue"}');
27 tc.write('{"name":"Lo Tae Zhao", "color": "Gold"}');
28 tc.write('{"name":"Ommadon", "color": "Red"}');
Listing 5.9 Output stream_transform.js: Implementing a custom
Transform object
C:\books\node\ch05>node stream_transform.js
Name: Carolinus
Color: Green
Data: {"name":"Carolinus","color":"Green","handled":true}
Name: Solarius
Color: Blue
Data: {"name":"Solarius","color":"Blue","handled":true}
Name: Lo Tae Zhao
Color: Gold
Data: {"name":"Lo Tae Zhao","color":"Gold","handled":true}
Name: Ommadon
Color: Red
Data: {"name":"Ommadon","color":"Red","handled":true}
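For small transforms, newer Node.js releases also let you pass the transform function in the constructor options instead of defining a subclass. The following sketch, which is an assumption-based illustration rather than part of the listing above, uppercases each chunk; passing the result as the second argument of the callback pushes it to the readable side:

```javascript
var stream = require('stream');

// The transform option supplies the _transform() implementation.
var upper = new stream.Transform({
  transform: function(chunk, encoding, callback) {
    // Passing data as the second callback argument pushes it downstream.
    callback(null, chunk.toString().toUpperCase());
  }
});

var output = [];
upper.on('data', function(data) {
  output.push(data.toString());
});
upper.on('end', function() {
  console.log(output.join(" "));
});

upper.write("carolinus");
upper.write("solarius");
upper.end();
```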
Piping Readable Streams to Writable Streams
One of the coolest things you can do with stream objects is to chain Readable
streams to Writable streams using the pipe(writableStream,
[options]) function. This does exactly what the name implies. The output from
the Readable stream is directly input into the Writable stream. The options
parameter accepts an object with the end property set to true or false. When
end is true, the Writable stream ends when the Readable stream ends. This
is the default behavior. For example:
readStream.pipe(writeStream, {end:true});
You can also break the pipe programmatically using the
unpipe(destinationStream) method. Listing 5.10 implements a Readable
stream and a Writable stream and then uses the pipe() function to chain them
together. To show you the basic process, the data input from the _write() method
is output to the console in Listing 5.10 Output.
Listing 5.10 stream_piped.js: Piping a Readable stream into a
Writable stream
var stream = require('stream');
var util = require('util');
util.inherits(Reader, stream.Readable);
util.inherits(Writer, stream.Writable);
function Reader(opt) {
  stream.Readable.call(this, opt);
  this._index = 1;
}
Reader.prototype._read = function(size) {
  var i = this._index++;
  if (i > 10){
    this.push(null);
  } else {
    this.push("Item " + i.toString());
  }
};
function Writer(opt) {
  stream.Writable.call(this, opt);
  this._index = 1;
}
Writer.prototype._write = function(data, encoding, callback) {
  console.log(data.toString());
  callback();
};
var r = new Reader();
var w = new Writer();
r.pipe(w);
Listing 5.10 Output stream_piped.js: Implementing stream piping
C:\books\node\ch05>node stream_piped.js
Item 1
Item 2
Item 3
Item 4
Item 5
Item 6
Item 7
Item 8
Item 9
Item 10
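The end option matters most when more than one source feeds the same destination. The following is a minimal sketch rather than one of the chapter's listings: it pipes two Readable streams into one Writable stream, keeping the destination open after the first source ends. The collector() and pipeInSequence() helpers are hypothetical names chosen for illustration, and stream.Readable.from() assumes a reasonably recent Node.js release.

```javascript
var stream = require('stream');

// Hypothetical helper: a Writable that stores each chunk it receives.
function collector(chunks) {
  return new stream.Writable({
    write: function(data, encoding, callback) {
      chunks.push(data.toString());
      callback();
    }
  });
}

// Hypothetical helper: pipes first, then second, into dest.
function pipeInSequence(first, second, dest) {
  // end:false keeps dest open when the first source ends.
  first.pipe(dest, {end: false});
  first.on('end', function() {
    // The default (end:true) ends dest when the second source ends.
    second.pipe(dest);
  });
}

var received = [];
var dest = collector(received);
dest.on('finish', function() {
  console.log(received.join(','));
});
pipeInSequence(stream.Readable.from(['a', 'b']),
               stream.Readable.from(['c']),
               dest);
```

If the first pipe() call used the default options, the destination would already be ended by the time the second source tried to write to it.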
Compressing and Decompressing Data with Zlib
When working with large systems or moving large amounts of data around, it is
helpful to be able to compress and decompress the data. Node.js provides an
excellent library in the Zlib module that allows you to easily and efficiently
compress and decompress data in buffers.
Keep in mind that compressing data takes CPU cycles. So you should be certain of
the benefits of compressing the data before incurring the
compression/decompression cost. The compression methods supported by Zlib are
gzip/gunzip: Standard gzip compression
deflate/inflate: Standard deflate compression algorithm based on Huffman
coding
deflateRaw/inflateRaw: Deflate compression algorithm on a raw buffer
Compressing and Decompressing Buffers
The Zlib module provides several helper functions that make it easy to compress
and decompress data buffers. These all use the same basic format of
function(buffer, callback), where function is the compression/decompression
method, buffer is the buffer to be compressed/decompressed, and callback is the
callback function executed after the compression/decompression occurs.
The simplest way to illustrate buffer compression/decompression is to show you
some examples. Listing 5.11 provides several compression/decompression examples,
and the size result of each example is shown in Listing 5.11 Output.
Listing 5.11 zlib_buffers.js: Compressing/decompressing buffers using
the Zlib module
var zlib = require("zlib");
var input = '...............text...............';
zlib.deflate(input, function(err, buffer) {
  if (!err) {
    console.log("deflate (%s): ", buffer.length, buffer.toString('base64'));
    zlib.inflate(buffer, function(err, buffer) {
      if (!err) {
        console.log("inflate (%s): ", buffer.length, buffer.toString());
      }
    });
    zlib.unzip(buffer, function(err, buffer) {
      if (!err) {
        console.log("unzip deflate (%s): ", buffer.length, buffer.toString());
      }
    });
  }
});
zlib.deflateRaw(input, function(err, buffer) {
  if (!err) {
    console.log("deflateRaw (%s): ", buffer.length, buffer.toString('base64'));
    zlib.inflateRaw(buffer, function(err, buffer) {
      if (!err) {
        console.log("inflateRaw (%s): ", buffer.length, buffer.toString());
      }
    });
  }
});
zlib.gzip(input, function(err, buffer) {
  if (!err) {
    console.log("gzip (%s): ", buffer.length, buffer.toString('base64'));
    zlib.gunzip(buffer, function(err, buffer) {
      if (!err) {
        console.log("gunzip (%s): ", buffer.length, buffer.toString());
      }
    });
    zlib.unzip(buffer, function(err, buffer) {
      if (!err) {
        console.log("unzip gzip (%s): ", buffer.length, buffer.toString());
      }
    });
  }
});
Listing 5.11 Output zlib_buffers.js: Compressing/decompressing
buffers
C:\books\node\ch05>node zlib_buffers.js
deflate (18): eJzT00MBJakVJagiegB9Zgcq
deflateRaw (12): 09NDASWpFSWoInoA
gzip (30): H4sIAAAAAAAAC9PTQwElqRUlqCJ6AIq+x+AiAAAA
inflate (34): ...............text...............
unzip deflate (34): ...............text...............
inflateRaw (34): ...............text...............
gunzip (34): ...............text...............
unzip gzip (34): ...............text...............
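The Zlib module also offers synchronous counterparts of these helpers, such as gzipSync() and gunzipSync(), which return the resulting Buffer directly instead of passing it to a callback. The following is a small sketch of a round trip using them. It blocks the event loop while compressing, so treat it as suitable for short scripts rather than busy servers.

```javascript
var zlib = require('zlib');

var input = '...............text...............';

// The *Sync helpers return a Buffer instead of invoking a callback.
var gzipped = zlib.gzipSync(input);
var restored = zlib.gunzipSync(gzipped).toString();

console.log('original: %d bytes, gzip: %d bytes', input.length, gzipped.length);
console.log('round trip ok: %s', restored === input);
```

For an input this short, the compressed form gains little or nothing once the gzip header and trailer are counted, which illustrates why you should weigh the benefit of compression before paying its CPU cost.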
Compressing/Decompressing Streams
Compressing/decompressing streams using Zlib is slightly different from
compressing/decompressing buffers. Instead of passing a buffer and a callback to a
helper function, you use the pipe() function to pipe the data from one stream
through the compression/decompression object into another stream. This can apply
to compressing any Readable stream into a Writable stream.
A good example of doing this is compressing the contents of a file using
fs.ReadStream and fs.WriteStream. Listing 5.12 shows an example of
compressing the contents of a file using a Gzip object created by
zlib.createGzip() and then decompressing it back using a Gunzip object
created by zlib.createGunzip().
Listing 5.12 zlib_file.js: Compressing/decompressing a file stream using
the Zlib module
var zlib = require("zlib");
var fs = require('fs');
// Compress zlib_file.js into zlib_file.gz.
var gzip = zlib.createGzip();
var inFile = fs.createReadStream('zlib_file.js');
var outFile = fs.createWriteStream('zlib_file.gz');
inFile.pipe(gzip).pipe(outFile);
// Decompress only after the compressed file has been completely written.
outFile.on('finish', function(){
  var gunzip = zlib.createGunzip();
  var gzFile = fs.createReadStream('zlib_file.gz');
  var unzippedFile = fs.createWriteStream('zlib_file.unzipped');
  gzFile.pipe(gunzip).pipe(unzippedFile);
});
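Chained pipe() calls do not forward errors from one stage to the next, so a failure in the middle of the chain can go unnoticed. As a hedged alternative, the sketch below uses stream.pipeline(), which is available in newer Node.js releases and calls a single callback when the whole chain finishes or any stage fails. The gzipFile() helper and the temporary file names are assumptions made for this example.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');
var stream = require('stream');
var zlib = require('zlib');

// Hypothetical helper: compresses src into dest and calls done(err) once
// the destination is completely written or as soon as any stage fails.
function gzipFile(src, dest, done) {
  stream.pipeline(
    fs.createReadStream(src),
    zlib.createGzip(),
    fs.createWriteStream(dest),
    done
  );
}

// Example run against a throwaway file in the temp directory.
var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'gzip-'));
var src = path.join(dir, 'sample.txt');
fs.writeFileSync(src, 'some text to compress');
gzipFile(src, src + '.gz', function(err) {
  console.log(err ? 'Compression failed.' : 'Compression finished.');
});
```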
Summary
At the heart of most intense web applications and services is a lot of data streaming
from one system to another. In this chapter, you learned how to use functionality
built into Node.js to work with JSON data, manipulate binary buffer data, and utilize
data streams. You also learned about compressing buffered data as well as running
data streams through compression/decompression.
Next
In the next chapter, you see how to interact with the file system from Node.js. You
get a chance to read/write files, create directories, and read file system information.
6
Accessing the File System from Node.js
Interacting with the file system in Node.js is important especially if you need to
manage dynamic files to support a web application or service. Node.js provides a
good interface for interacting with the file system in the fs module. This module
provides the standard file access APIs that are available in most languages to open,
read, write, and interact with files.
This chapter provides you with the fundamentals necessary to access the file system
from Node.js applications. You should come away with the ability to create, read,
and modify files as well as navigate the directory structure. You also learn how to
access file and folder information as well as delete, truncate, and rename files and
folders.
For all the file system calls discussed in this chapter, you need to have loaded the fs
module, for example:
var fs = require('fs');
Synchronous Versus Asynchronous File System Calls
The fs module provided in Node.js makes almost all functionality available in two
forms: asynchronous and synchronous. For example, there is the asynchronous form
write() and the synchronous form writeSync(). It is important to understand
the difference when you are implementing your code.
Synchronous file system calls block until the call completes and then control is
released back to the thread. This has advantages but can also cause severe
performance issues in Node.js if synchronous calls block the main event thread or
too many of the background thread pool threads. Therefore, synchronous file system
calls should be limited in use when possible.
Asynchronous calls are placed on the event queue to be run later. This allows the
calls to fit into the Node.js event model; however, this can be tricky when executing
your code because the calling thread continues to run before the asynchronous call
gets picked up by the event loop.
For the most part, the underlying functionality of both synchronous and
asynchronous file system calls is exactly the same. They both accept the same
parameters with the exception that all asynchronous calls require an extra parameter
at the end, which is a callback function to execute when the file system call
completes.
The following list describes the important differences between synchronous and
asynchronous file system calls in Node.js:
Asynchronous calls require a callback function as an extra parameter. The
callback function is executed when the file system request completes, and
typically contains an error as its first parameter.
Exceptions are automatically handled by asynchronous calls, and an error object
is passed as the first parameter if an exception occurs. Exceptions in
synchronous calls must be handled by your own try/catch blocks of code.
Synchronous calls are run immediately, and execution does not return to the
current thread until they are complete. Asynchronous calls are placed on the
event queue, and execution returns to the running thread code, but the actual
call will not execute until picked up by the event loop.
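The differences in error handling and execution order can be sketched with a read of a file that does not exist. This is a minimal illustration, assuming that no file with the hypothetical name used below is present in the current directory.

```javascript
var fs = require('fs');
var missing = 'no_such_file_for_this_example.txt'; // assumed not to exist

// Synchronous: the failure is thrown, so it has to be caught.
var syncError = null;
try {
  fs.readFileSync(missing);
} catch (e) {
  syncError = e;
}
console.log('Sync error code: %s', syncError && syncError.code);

// Asynchronous: nothing is thrown here; the error arrives as the first
// callback parameter when the event loop runs the callback.
fs.readFile(missing, function(err, data) {
  console.log('Async error code: %s', err && err.code);
});
console.log('readFile() has returned; its callback has not run yet.');
```

The last console.log() call runs before the asynchronous callback does, because the callback is only executed once the event loop picks up the completed request.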
Opening and Closing Files
Node provides synchronous and asynchronous methods for opening files. Once a file
is opened, you can read data from it or write data to it depending on the flags used to
open the file. To open files in a Node.js app, use one of the following statements for
asynchronous or synchronous:
fs.open(path, flags, [mode], callback)
fs.openSync(path, flags, [mode])
The path parameter specifies a standard path string for your file system. The
flags parameter specifies what mode to open the file in—read, write, append, and
so on—as described in Table 6.1. The optional mode parameter sets the file access
mode and defaults to 0666, which is readable and writable.
Table 6.1 Flags that define how files are opened
Flag  Description
r     Open file for reading. An exception occurs if the file does not exist.
r+    Open file for reading and writing. An exception occurs if the file does
      not exist.
rs    Open file for reading in synchronous mode. This is not the same as
      forcing fs.openSync(). When used, the OS bypasses the local file system
      cache. Useful on NFS mounts because it lets you skip the potentially
      stale local cache. You should only use this flag if necessary because it
      can have a negative impact on performance.
rs+   Same as rs except the file is opened for reading and writing.
w     Open file for writing. The file is created if it does not exist or
      truncated if it does exist.
wx    Same as w but fails if the path exists.
w+    Open file for reading and writing. The file is created if it does not
      exist or truncated if it exists.
wx+   Same as w+ but fails if the path exists.
a     Open file for appending. The file is created if it does not exist.
ax    Same as a but fails if the path exists.
a+    Open file for reading and appending. The file is created if it does not
      exist.
ax+   Same as a+ but fails if the path exists.
Once a file has been opened, you need to close it to force flushing changes to disk
and release the OS lock. Closing a file is done using one of the following methods
and passing the file handle to it. In the case of the asynchronous close() call, you
also need to specify a callback function:
fs.close(fd, callback)
fs.closeSync(fd)
The following shows an example of opening and closing a file in asynchronous
mode. Notice that a callback function is specified that receives an err and an fd
parameter. The fd parameter is the file descriptor that you can use to read or write to
the file:
fs.open("myFile", 'w', function(err, fd){
  if (!err){
    fs.close(fd, function(err){
      if (err){
        console.log("Close Failed.");
      }
    });
  }
});
The following shows an example of opening and closing a file in synchronous mode.
Notice that there is no callback function and that the file descriptor used to read
and write to the file is returned directly from fs.openSync():
var fd = fs.openSync("myFile", 'w');
fs.closeSync(fd);
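The effect of the flags in Table 6.1 is easiest to see when an open fails. The following sketch, which works in a temporary directory created just for the example, shows the wx flag refusing to overwrite an existing file and the r flag failing on a missing one. The directory prefix and file names are arbitrary choices.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'flags-'));
var file = path.join(dir, 'myFile');

// 'wx' creates the file because it does not exist yet.
fs.closeSync(fs.openSync(file, 'wx'));

// A second 'wx' open fails because the path now exists.
var secondOpenCode = null;
try {
  fs.closeSync(fs.openSync(file, 'wx'));
} catch (e) {
  secondOpenCode = e.code;
}
console.log('Second wx open: %s', secondOpenCode);

// 'r' on a path that does not exist throws as well.
var missingCode = null;
try {
  fs.openSync(path.join(dir, 'missing'), 'r');
} catch (e) {
  missingCode = e.code;
}
console.log('Open missing file with r: %s', missingCode);
```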
Writing Files
The fs module provides four different ways to write data to files. You can write
data to a file in a single call, write chunks using synchronous writes, write chunks
using asynchronous writes, or stream writes through a Writable stream. Each of
these methods accepts either a String or a Buffer object as input. The following
sections describe how to use these methods.
Simple File Write
The simplest method for writing data to a file is to use one of the writeFile()
methods. These methods write the full contents of a String or Buffer to a file.
The following shows the syntax for the writeFile() methods:
fs.writeFile(path, data, [options], callback)
fs.writeFileSync(path, data, [options])
The path parameter specifies the path to the file. The path can be relative or
absolute. The data parameter specifies the String or Buffer object to be
written to the file. The optional options parameter is an object that can contain
encoding, mode, and flag properties that define the string encoding as well as
the mode and flags used when opening the file. The asynchronous method also
requires a callback that is called when the file write has been completed.
Listing 6.1 implements a simple asynchronous writeFile() request to store a
JSON string of a config object in a file. Listing 6.1 Output shows the output of the
code.
Listing 6.1 file_write.js: Writing a JSON string to a file
var fs = require('fs');
var config = {
  maxFiles: 20,
  maxConnections: 15,
  rootPath: "/webroot"
};
var configTxt = JSON.stringify(config);
var options = {encoding:'utf8', flag:'w'};
fs.writeFile('config.txt', configTxt, options, function(err){
  if (err){
    console.log("Config Write Failed.");
  } else {
    console.log("Config Saved.");
  }
});
Listing 6.1 Output file_write.js: Writing a configuration file
C:\books\node\ch06\writing>node file_write.js
Config Saved.
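The flag property in the options object decides whether writeFile() replaces the file or adds to it. The following is a short sketch using the synchronous form so that the result can be read back immediately. The log file name and temporary directory are assumptions for the example.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'write-'));
var logFile = path.join(dir, 'log.txt');

// flag 'w' creates the file or replaces its contents.
fs.writeFileSync(logFile, 'first\n', {encoding: 'utf8', flag: 'w'});
// flag 'a' appends to what is already there.
fs.writeFileSync(logFile, 'second\n', {encoding: 'utf8', flag: 'a'});

var contents = fs.readFileSync(logFile, 'utf8');
console.log(JSON.stringify(contents));
```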
Synchronous File Writing
The synchronous method of file writing writes the data to the file before returning
execution to the running thread. This provides the advantage of allowing you to write
multiple times in the same section of code, but this can be a disadvantage if the file
writes hold up other threads as discussed earlier.
To write to a file synchronously, first open it using openSync() to get a file
descriptor and then use fs.writeSync() to write data to the file. The following
shows the syntax for fs.writeSync():
fs.writeSync(fd, data, offset, length, position)
The fd parameter is the file descriptor returned by openSync(). The data
parameter specifies the String or Buffer object to be written to the file. The
offset parameter specifies the index in the input data to begin reading from; if
you want to begin at the current index in the String or Buffer, this value should
be null. The length specifies the number of bytes to write; specifying null
writes until the end of the data buffer. The position argument specifies the
position in the file to begin writing at; specifying null for this value uses the
current file position.
Listing 6.2 illustrates implementing basic synchronous writing to store a series of
string data in a file. Listing 6.2 Output shows the result.
Listing 6.2 file_write_sync.js: Performing synchronous writes to a file
var fs = require('fs');
var veggieTray = ['carrots', 'celery', 'olives'];
var fd = fs.openSync('veggie.txt', 'w');
while (veggieTray.length){
  var veggie = veggieTray.pop() + " ";
  var bytes = fs.writeSync(fd, veggie, null, null);
  console.log("Wrote %s %dbytes", veggie, bytes);
}
fs.closeSync(fd);
Listing 6.2 Output file_write_sync.js: Writing synchronously to a file
C:\books\node\ch06\writing>node file_write_sync.js
Wrote olives 7bytes
Wrote celery 7bytes
Wrote carrots 8bytes
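The offset, length, and position parameters are easier to follow when the data is a Buffer. The sketch below writes ten digits to a file and then overwrites three bytes in the middle with a slice taken from a second buffer. The file name and buffer contents are made up for the example.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'writesync-'));
var file = path.join(dir, 'digits.txt');
var fd = fs.openSync(file, 'w');

// Write the whole buffer starting at file position 0.
var digits = Buffer.from('0123456789');
fs.writeSync(fd, digits, 0, digits.length, 0);

// Write 3 bytes of letters, starting at index 2 ('cde'),
// to file position 4.
var letters = Buffer.from('abcdef');
var written = fs.writeSync(fd, letters, 2, 3, 4);
fs.closeSync(fd);

var result = fs.readFileSync(file, 'utf8');
console.log('Wrote %d bytes, file now contains %s', written, result);
```

Here offset and length select which part of the input buffer is written, and position selects where in the file those bytes land.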
Asynchronous File Writing
The asynchronous method of file writing puts the write request on the event queue
and then returns control back to the calling code. The actual write does not take place
until the event loop picks up the write request and executes it. You need to be careful
when performing multiple asynchronous write requests on the same file, since you
cannot guarantee what order they will be executed unless you wait for the first write
callback before executing the next. Typically the simplest way to do this is to nest
writes inside the callback from the previous write. Listing 6.3 illustrates that process.
To write to a file asynchronously, first open it using open() and then, after the
callback from the open request has executed, use fs.write() to write data to the
file. The following shows the syntax for fs.write():
fs.write(fd, data, offset, length, position, callback)
The fd parameter is the file descriptor passed to the open() callback. The data
parameter specifies the String or Buffer object to be written to the file. The
offset parameter specifies the index in the input data to begin reading data; if you
want to begin at the current index in the String or Buffer, this value should be
null. The length specifies the number of bytes to write; specifying null writes
until the end of the buffer. The position argument specifies the position in the file
to begin writing at; specifying null for this value uses the current file position.
The callback argument must be a function that can accept two parameters,
error and bytes, where error is an error that occurred during the write and
bytes specifies the number of bytes written.
Listing 6.3 illustrates implementing basic asynchronous writing to store a series of
string data in a file. Notice that the callback specified in lines 18–20 in the open()
callback calls the writeFruit() function and passes the file descriptor. Also
notice that the write() callback specified in lines 6–13 also calls
writeFruit() and passes the file descriptor. This ensures that the asynchronous
write completes before executing another. Listing 6.3 Output shows the output of the
code.
Listing 6.3 file_write_async.js: Performing asynchronous writes to a file
01 var fs = require('fs');
02 var fruitBowl = ['apple', 'orange', 'banana', 'grapes'];
03 function writeFruit(fd){
04   if (fruitBowl.length){
05     var fruit = fruitBowl.pop() + " ";
06     fs.write(fd, fruit, null, null, function(err, bytes){
07       if (err){
08         console.log("File Write Failed.");
09       } else {
10         console.log("Wrote: %s %dbytes", fruit, bytes);
11         writeFruit(fd);
12       }
13     });
14   } else {
15     fs.close(fd, function(){});
16   }
17 }
18 fs.open('fruit.txt', 'w', function(err, fd){
19   writeFruit(fd);
20 });
Listing 6.3 Output file_write_async.js: Writing asynchronously to a file
C:\books\node\ch06\writing>node file_write_async.js
Wrote: grapes 7bytes
Wrote: banana 7bytes
Wrote: orange 7bytes
Wrote: apple 6bytes
Streaming File Writing
One of the best methods to use when writing large amounts of data to a file is the
streaming method. This method opens the file as a Writable stream. As discussed
in Chapter 5, “Handling Data I/O in Node.js,” Writable streams can easily be
implemented and linked to Readable streams using the pipe() method, which
makes it easy to write data from a Readable stream source such as an HTTP
request.
To stream data to a file asynchronously, you first need to create a Writable stream
object using the following syntax:
fs.createWriteStream(path, [options])
The path parameter specifies the path to the file and can be relative or absolute.
The optional options parameter is an object that can contain encoding, mode,
and flags properties that define the string encoding as well as the mode and flags
used when opening the file.
Once you have opened the Writable file stream, you can write to it using the
standard stream write(buffer) method. When you are finished writing, call the
end() method to close the stream.
Listing 6.4 illustrates implementing a basic Writable file stream. Notice that when
the code is finished writing, the end() method is executed on line 13, which
triggers the close event. Listing 6.4 Output shows the output of the code.
Listing 6.4 file_write_stream.js: Implementing a Writable stream to
allow streaming writes to a file
01 var fs = require('fs');
02 var grains = ['wheat', 'rice', 'oats'];
03 var options = { encoding: 'utf8', flags: 'w' };
04 var fileWriteStream = fs.createWriteStream("grains.txt", options);
05 fileWriteStream.on("close", function(){
06   console.log("File Closed.");
07 });
08 while (grains.length){
09   var data = grains.pop() + " ";
10   fileWriteStream.write(data);
11   console.log("Wrote: %s", data);
12 }
13 fileWriteStream.end();
Listing 6.4 Output file_write_stream.js: Implementing streaming writes
to a file
C:\books\node\ch06\writing>node file_write_stream.js
Wrote: oats
Wrote: rice
Wrote: wheat
File Closed.
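Because the object returned by fs.createWriteStream() is a regular Writable stream, a Readable stream can be piped straight into it instead of calling write() and end() by hand. The following sketch assumes a Node.js release that provides stream.Readable.from() and uses a temporary file so that it does not overwrite grains.txt.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');
var stream = require('stream');

var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'grains-'));
var grainFile = path.join(dir, 'grains.txt');

// Stand-in source; in practice this could be any Readable stream.
var source = stream.Readable.from(['oats ', 'rice ', 'wheat ']);
var fileWriteStream = fs.createWriteStream(grainFile,
                                           { encoding: 'utf8', flags: 'w' });

fileWriteStream.on('close', function(){
  console.log('File Closed: %s', fs.readFileSync(grainFile, 'utf8'));
});

// pipe() ends the file stream automatically when the source ends.
source.pipe(fileWriteStream);
```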
Reading Files
The fs module also provides four different ways to read data from files. You can
read data in one large chunk, read chunks of data using synchronous reads, read
chunks of data using asynchronous reads, or stream reads through a Readable
stream. Each of these methods is effective. Which one you should use depends on
the particular needs of your application. The following sections describe how to use
and implement these methods.
Simple File Read
The simplest method for reading data from a file is to use one of the readFile()
methods. These methods read the full contents of a file into a data buffer. The
following shows the syntax for the readFile() methods:
fs.readFile(path, [options], callback)
fs.readFileSync(path, [options])
The path parameter specifies the path to the file and can be relative or absolute.
The optional options parameter is an object that can contain encoding and
flag properties that define the string encoding and the flag used when opening
the file. The asynchronous method also requires a callback
that is called when the file read has been completed.
Listing 6.5 illustrates implementing a simple asynchronous readFile() request to
read a JSON string from a configuration file and then use it to create a config
object. Listing 6.5 Output shows the output of the code.
Listing 6.5 file_read.js: Reading a JSON string file to an object
var fs = require('fs');
var options = {encoding:'utf8', flag:'r'};
fs.readFile('config.txt', options, function(err, data){
  if (err){
    console.log("Failed to open Config File.");
  } else {
    console.log("Config Loaded.");
    var config = JSON.parse(data);
    console.log("Max Files: " + config.maxFiles);
    console.log("Max Connections: " + config.maxConnections);
    console.log("Root Path: " + config.rootPath);
  }
});
Listing 6.5 Output file_read.js: Reading a configuration file to an object
C:\books\node\ch06\reading>node file_read.js
Config Loaded.
Max Files: 20
Max Connections: 15
Root Path: /webroot
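Reading a configuration file can fail in two ways: the file may be unreadable, or its contents may not be valid JSON. The sketch below uses readFileSync() inside a try/catch block so that both cases are handled in one place. The loadConfig() helper and the file names are hypothetical and are not part of the book's listings.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper: returns the parsed config object, or null if the
// file cannot be read or does not contain valid JSON.
function loadConfig(file) {
  try {
    return JSON.parse(fs.readFileSync(file, {encoding: 'utf8', flag: 'r'}));
  } catch (e) {
    return null;
  }
}

var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'config-'));
var good = path.join(dir, 'config.txt');
var bad = path.join(dir, 'broken.txt');
fs.writeFileSync(good, JSON.stringify({maxFiles: 20, maxConnections: 15}));
fs.writeFileSync(bad, '{"maxFiles": 20');   // truncated JSON

console.log(loadConfig(good));
console.log(loadConfig(bad));
console.log(loadConfig(path.join(dir, 'missing.txt')));
```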
Synchronous File Reading
The synchronous method of file reading reads the data from the file before returning
execution to the running thread. This provides the advantage of allowing you to read
multiple times in the same section of code, but this can be a disadvantage if the file
reads hold up other threads as discussed earlier.
To read from a file synchronously, first open it using openSync() to get a file
descriptor and then use readSync() to read data from the file. The following
shows the syntax for readSync():
fs.readSync(fd, buffer, offset, length, position)
The fd parameter is the file descriptor returned by openSync(). The buffer
parameter specifies the Buffer object that data will be read into from the file. The
offset parameter specifies the index in the buffer to begin writing data; if you
want to begin at the current index in the Buffer, this value should be null. The
length specifies the number of bytes to read; specifying null reads until the end
of the buffer. The position argument specifies the position in the file to begin
reading from; specifying null for this value uses the current file position.
Listing 6.6 illustrates implementing basic synchronous reading to read a chunk of
string data from a file. Listing 6.6 Output shows the output of the code.
Listing 6.6 file_read_sync.js: Performing synchronous reads from a file
var fs = require('fs');
var fd = fs.openSync('veggie.txt', 'r');
var veggies = "";
do {
  var buf = Buffer.alloc(5);  // zero-filled 5-byte buffer
  var bytes = fs.readSync(fd, buf, 0, 5, null);
  console.log("read %dbytes", bytes);
  // Append only the bytes that were actually read.
  veggies += buf.toString('utf8', 0, bytes);
} while (bytes > 0);
fs.closeSync(fd);
console.log("Veggies: " + veggies);
Listing 6.6 Output file_read_sync.js: Reading synchronously from a file
C:\books\node\ch06\reading>node file_read_sync.js
read 5bytes
read 5bytes
read 5bytes
read 5bytes
read 2bytes
read 0bytes
Veggies: olives celery carrots
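The offset and position parameters of readSync() also allow random access: position selects where in the file the read starts, and offset selects where in the buffer the data is placed. The following sketch reads only the second word from a file laid out like veggie.txt. The temporary file and the dash-filled buffer are assumptions chosen to make the placement visible.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'readsync-'));
var file = path.join(dir, 'veggie.txt');
fs.writeFileSync(file, 'olives celery carrots ');

var fd = fs.openSync(file, 'r');
var buf = Buffer.alloc(10, '-');   // 10 bytes pre-filled with '-'

// Read 6 bytes starting at file position 7 ("celery")
// into buf starting at index 2.
var bytesRead = fs.readSync(fd, buf, 2, 6, 7);
fs.closeSync(fd);

console.log('read %dbytes: %s', bytesRead, buf.toString());
```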
Asynchronous File Reading
The asynchronous method of file reading puts the read request on the event queue
and then returns control back to the calling code. The actual read does not take place
until the event loop picks up the read request and executes it. You need to be careful
when performing multiple asynchronous read requests on the same file, since you
cannot guarantee what order they will be executed unless you wait for the first read
callback to execute before executing the next read. Typically the simplest way to do
this is to nest reads inside the callback from the previous read. Listing 6.7 illustrates
that process.
To read from a file asynchronously, first open it using open() and then, after the
callback from the open request has executed, use read() to read data from the file.
The following shows the syntax for read():
fs.read(fd, buffer, offset, length, position, callback)
The fd parameter is the file descriptor passed to the open() callback. The buffer
parameter specifies the Buffer object that data will be read into from the file. The
offset parameter specifies the index in the buffer to begin writing data; if you
want to begin at the current index in the Buffer, this value should be null. The
length specifies the number of bytes to read; specifying null reads until the end
of the buffer. The position argument specifies the position in the file to begin
reading from; specifying null for this value uses the current file position.
The callback argument must be a function that can accept three parameters:
error, bytes, and buffer. The error parameter is an error that occurred
during the read, bytes specifies the number of bytes read, and buffer is the
buffer with data populated from the read request.
Listing 6.7 illustrates implementing basic asynchronous reading to read chunks of
data from a file. Notice that the callback specified in lines 16–18 in the open()
callback calls the readFruit() function and passes the file descriptor. Also
notice that the read() callback specified in lines 5–13 also calls readFruit()
and passes the file descriptor. This ensures that the asynchronous read completes
before executing another. Listing 6.7 Output shows the output of the code.
Listing 6.7 file_read_async.js: Performing asynchronous reads from a
file
01 var fs = require('fs');
02 function readFruit(fd, fruits){
03   var buf = Buffer.allocUnsafe(5);
04   buf.fill(0);
05   fs.read(fd, buf, 0, 5, null, function(err, bytes, data){
06     if ( bytes > 0) {
07       console.log("read %dbytes", bytes);
08       fruits += data.toString('utf8', 0, bytes);
09       readFruit(fd, fruits);
10     } else {
11       fs.close(fd, function(){});
12       console.log ("Fruits: %s", fruits);
13     }
14   });
15 }
16 fs.open('fruit.txt', 'r', function(err, fd){
17   readFruit(fd, "");
18 });
Listing 6.7 Output file_read_async.js: Reading asynchronously from a
file
C:\books\node\ch06\reading>node file_read_async.js
read 5bytes
read 5bytes
read 5bytes
read 5bytes
read 5bytes
read 2bytes
Fruits: grapes banana orange apple
Streaming File Reading
One of the best methods to use when reading large amounts of data from a file is the
streaming method. This method opens the file as a Readable stream. As discussed
in Chapter 5, Readable streams can easily be implemented and linked to
Writable streams using the pipe() method. This makes it easy to read data from
a file and inject it into a Writable stream destination such as an HTTP response.
To stream data from a file asynchronously, you first need to create a Readable
stream object using the following syntax:
fs.createReadStream(path, [options])
The path parameter specifies the path to the file. The path can be relative or
absolute. The optional options parameter is an object that can contain
encoding, mode, and flags properties that define the string encoding as well as
the mode and flags used when opening the file.
Once you have opened the Readable file stream, you can easily read from it using
the readable event with read() requests or by implementing a data event
handler as shown in Listing 6.8.
Listing 6.8 illustrates implementing a basic Readable file stream. Notice that lines
4–7 implement a data event handler that continuously reads data from the stream.
Listing 6.8 Output shows the output of the code.
Listing 6.8 file_read_stream.js: Implementing a Readable stream to
allow streaming reads from a file
01 var fs = require('fs');
02 var options = { encoding: 'utf8', flags: 'r' };
03 var fileReadStream = fs.createReadStream("grains.txt", options);
04 fileReadStream.on('data', function(chunk) {
05   console.log('Grains: %s', chunk);
06   console.log('Read %d bytes of data.', chunk.length);
07 });
08 fileReadStream.on("close", function(){
09   console.log("File Closed.");
10 });
Listing 6.8 Output file_read_stream.js: Implementing streaming reads
from a file
C:\books\node\ch06\reading>node file_read_stream.js
Grains: oats rice wheat
Read 16 bytes of data.
File Closed.
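The other approach mentioned above, the readable event combined with read() requests, can be sketched as follows. The handler drains the stream's internal buffer each time data becomes available, and the end event signals that the whole file has been consumed. The temporary file is an assumption so that the example does not depend on grains.txt being present.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'readstream-'));
var grainFile = path.join(dir, 'grains.txt');
fs.writeFileSync(grainFile, 'oats rice wheat ');

var fileReadStream = fs.createReadStream(grainFile,
                                         { encoding: 'utf8', flags: 'r' });
var grains = '';

fileReadStream.on('readable', function(){
  var chunk;
  // read() returns null once the internal buffer is empty.
  while ((chunk = fileReadStream.read()) !== null) {
    grains += chunk;
  }
});
fileReadStream.on('end', function(){
  console.log('Grains: %s', grains);
});
```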
Other File System Tasks
In addition to reading and writing files, the fs module also provides functionality
for interacting with the file system—for example, listing files in a directory, looking
at file information, and much more. The following sections cover the most common
file system tasks that you may need to implement when creating Node.js
applications.
Verifying Path Existence
Before doing any kind of read/write operation on a file or directory, you might want
to verify whether the path exists. This can easily be done using one of the following
methods:
fs.exists(path, callback)
fs.existsSync(path)
The fs.existsSync(path) method returns true or false depending on whether
the path exists. Just as with any other asynchronous file system call, if you use
fs.exists(), you need to implement a callback that is executed when the call
completes. The callback is passed a Boolean value of true or false
depending on whether the path exists. For example, the following code verifies the
existence of a file named filesystem.js in the current path and displays the
results:
fs.exists('filesystem.js', function (exists) {
console.log(exists ? "Path Exists" : "Path Does Not Exist");
});
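Current Node.js documentation marks the asynchronous fs.exists() as deprecated and suggests fs.access() or fs.stat() instead, while fs.existsSync() remains available. The following sketch wraps fs.access() in a helper that reports the same Boolean result. The pathExists() name and the temporary files are assumptions made for illustration.

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Hypothetical helper: calls callback(true) if the path is visible to
// this process and callback(false) otherwise.
function pathExists(p, callback) {
  fs.access(p, fs.constants.F_OK, function(err){
    callback(!err);
  });
}

var dir = fs.mkdtempSync(path.join(os.tmpdir(), 'exists-'));
var present = path.join(dir, 'filesystem.js');
var absent = path.join(dir, 'missing.js');
fs.writeFileSync(present, '');

pathExists(present, function (exists) {
  console.log(exists ? "Path Exists" : "Path Does Not Exist");
});
console.log(fs.existsSync(absent) ? "Path Exists" : "Path Does Not Exist");
```

Keep in mind that any existence check can be out of date by the time the file is opened, so the open or read call itself still needs error handling.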
Getting File Info
Another common task is to get basic information about file system objects such as
file size, the mode, modify time, whether the entry is a file or folder, and so on. This
information can be obtained using one of the following calls:
fs.stat(path, callback)
fs.statSync(path)
The fs.statSync() method returns a Stats object, whereas the
fs.stat() method executes asynchronously and passes the Stats object to the
callback function as the second parameter. The first parameter is an error if one
occurs. Table 6.2 lists some of the most commonly used attributes and methods
attached to the Stats object.
Table 6.2 Attributes and methods of Stats objects for file system entries
Attribute/Method  Description
isFile()          Returns true if the entry is a file
isDirectory()     Returns true if the entry is a directory
isSocket()        Returns true if the entry is a socket
dev               Specifies the device ID on which the file is located
mode              Specifies the access mode of the file
size              Specifies the number of bytes in the file
blksize           Specifies the block size used to store the file in bytes
blocks            Specifies the number of blocks the file is taking on disk
atime             Specifies the time the file was last accessed
mtime             Specifies the time the file was last modified
ctime             Specifies the time the file status (inode data) was last changed
Listing 6.9 illustrates the use of the fs.stat() call by making the call and then
outputting the results of the object as a JSON string as well as using the isFile(),
isDirectory(), and isSocket() calls, as shown in Listing 6.9 Output.
Listing 6.9 file_stats.js: Implementing an fs.stat() call to retrieve
information about a file
var fs = require('fs');
fs.stat('file_stats.js', function (err, stats) {
  if (!err){
    // Dump every field of the Stats object as formatted JSON
    console.log('stats: ' + JSON.stringify(stats, null, ' '));
    console.log(stats.isFile() ? "Is a File" : "Is not a File");
    console.log(stats.isDirectory() ? "Is a Folder" : "Is not a Folder");
    console.log(stats.isSocket() ? "Is a Socket" : "Is not a Socket");
    // Other type checks available on the Stats object
    stats.isDirectory();
    stats.isBlockDevice();
    stats.isCharacterDevice();
    //stats.isSymbolicLink(); //only lstat
    stats.isFIFO();
    stats.isSocket();
  }
});
Listing 6.9 Output file_stats.js: Displaying information about a file
C:\books\node\ch06>node file_stats.js
stats: {
"dev": 818973644,
"mode": 33206,
"nlink": 1,
"uid": 0,
"gid": 0,
"rdev": 0,
"ino": 1970324837052284,
"size": 535,
"atime": "2016-09-14T18:03:26.572Z",
"mtime": "2013-11-26T21:51:51.148Z",
"ctime": "2014-12-18T17:30:43.340Z",
"birthtime": "2016-09-14T18:03:26.572Z"
}
Is a File
Is not a Folder
Is not a Socket
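The synchronous form can be sketched the same way. The example below assumes a hypothetical describeEntry() helper (not part of the fs module) and relies on the fact that fs.statSync() throws an exception rather than passing an error to a callback:

```javascript
const fs = require('fs');

// Hypothetical helper: classifies a path using the synchronous stat call.
function describeEntry(path) {
  try {
    const stats = fs.statSync(path);
    if (stats.isFile()) { return 'file (' + stats.size + ' bytes)'; }
    if (stats.isDirectory()) { return 'folder'; }
    return 'other';
  } catch (err) {
    // statSync() throws; ENOENT means the path does not exist
    return err.code === 'ENOENT' ? 'missing' : 'error: ' + err.code;
  }
}

console.log(describeEntry('file_stats.js'));
```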
Listing Files
Another common task when working with the file system is listing files and folders
in a directory. For example, you might want to determine whether the files and
folders need to be cleaned up, you might need to dynamically operate on the
directory structure, and so on.
To access the files in the file system, use one of the following commands to read a
list of entries:
fs.readdir(path, callback)
fs.readdirSync(path)
If readdirSync() is called, an array of strings representing the entry names in
the specified path is returned. In the case of readdir(), the list is passed as the
second parameter to the callback function and an error, if there is one, is passed
as the first.
To illustrate the use of readdir(), Listing 6.10 implements a nested callback
chain to walk the directory structure and output the entries. Notice that the callback
function implements a wrapper to provide closure for the fullPath variable, and
that the WalkDirs() function loops by being called by the asynchronous callback
function, as shown in Listing 6.10 Output.
Listing 6.10 file_readdir.js: Implementing a callback chain to walk down
and output the contents of a directory structure
var fs = require('fs');
var Path = require('path');
function WalkDirs(dirPath){
  console.log(dirPath);
  fs.readdir(dirPath, function(err, entries){
    for (var idx in entries){
      var fullPath = Path.join(dirPath, entries[idx]);
      // Wrap in a function so each stat callback keeps its own fullPath
      (function(fullPath){
        fs.stat(fullPath, function (err, stats){
          if (stats.isFile()){
            console.log(fullPath);
          } else if (stats.isDirectory()){
            // Recurse into the subdirectory from the asynchronous callback
            WalkDirs(fullPath);
          }
        });
      })(fullPath);
    }
  });
}
WalkDirs("../ch06");
Listing 6.10 Output file_readdir.js: Iteratively walking the directory
structure using chained asynchronous callbacks
C:\books\node\ch06>node file_readdir.js
../ch06
..\ch06\file_readdir.js
..\ch06\filesystem.js
..\ch06\data
..\ch06\file_stats.js
..\ch06\file_folders.js
..\ch06\renamed
..\ch06\reading
..\ch06\writing
..\ch06\data\config.txt
..\ch06\data\folderA
..\ch06\data\grains.txt
..\ch06\data\fruit.txt
..\ch06\reading\file_read.js
..\ch06\data\veggie.txt
..\ch06\data\log.txt
..\ch06\data\output.txt
..\ch06\writing\file_write.js
..\ch06\reading\file_read_async.js
..\ch06\reading\file_read_sync.js
..\ch06\reading\file_read_stream.js
..\ch06\writing\file_write_async.js
..\ch06\writing\file_write_stream.js
..\ch06\writing\file_write_sync.js
..\ch06\data\folderA\folderC
..\ch06\data\folderA\folderB
..\ch06\data\folderA\folderB\folderD
..\ch06\data\folderA\folderC\folderE
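When ordering matters, a synchronous walk may be easier to reason about. The sketch below assumes a hypothetical listTree() helper and uses the withFileTypes option of readdirSync(), which is available in newer Node.js releases and returns entry objects that already know whether they are files or directories:

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: returns every file and folder path below dirPath.
function listTree(dirPath) {
  let results = [];
  // withFileTypes returns Dirent objects, so no separate stat call is needed
  const entries = fs.readdirSync(dirPath, { withFileTypes: true });
  for (const entry of entries) {
    const fullPath = path.join(dirPath, entry.name);
    results.push(fullPath);
    if (entry.isDirectory()) {
      results = results.concat(listTree(fullPath));
    }
  }
  return results;
}

// Only run the demo if the chapter folder from the listing is present
if (fs.existsSync('../ch06')) {
  listTree('../ch06').forEach(function (p) { console.log(p); });
}
```

Because each directory is fully listed before the next one starts, parent entries always appear before their children, unlike the interleaved output of the asynchronous version.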
Deleting Files
Another common task when working with files is deleting them to clean up data or
make more room on the file system. To delete a file from Node.js, use one of the
following commands:
fs.unlink(path, callback)
fs.unlinkSync(path)
The unlinkSync(path) call returns undefined when the delete succeeds and throws
an exception when it fails. The asynchronous unlink() call passes back an error
value to the callback function if an error is encountered when deleting the file.
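Because unlinkSync() reports failure by throwing, you typically wrap it in a try/catch block. A minimal sketch, assuming a hypothetical deleteFile() helper that converts the exception into a Boolean:

```javascript
const fs = require('fs');

// Hypothetical helper: returns true if the file was deleted, false otherwise.
function deleteFile(path) {
  try {
    fs.unlinkSync(path);   // returns undefined on success
    return true;
  } catch (err) {
    // For example, err.code is 'ENOENT' when the file does not exist
    console.log("File Delete Failed: " + err.code);
    return false;
  }
}

console.log(deleteFile("new.txt") ? "File Deleted" : "Nothing Deleted");
```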
The following code snippet illustrates the process of deleting a file named new.txt
using the unlink() asynchronous fs call:
fs.unlink("new.txt", function(err){
  console.log(err ? "File Delete Failed" : "File Deleted");
});
Truncating Files
Truncating a file means reducing the size of the file by setting the end to a smaller
value than the current size. You might want to truncate a file that grows continuously
but does not contain critical data, such as a temporary log. To truncate a file, use one
of the following fs calls and pass in the number of bytes you want the file to contain
when the truncation completes:
fs.truncate(path, len, callback)
fs.truncateSync(path, len)
The truncateSync(path, len) call returns undefined on success and throws an
exception if the file cannot be truncated. The asynchronous truncate() call passes
an error value to the callback function if an error is encountered when truncating
the file.
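To make the effect of the len argument concrete, the following sketch truncates a scratch file to its first 8 bytes. The file name and its location under the OS temporary directory are assumptions chosen so the example does not touch real data:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Work on a scratch file so the sketch does not touch real data
const logPath = path.join(os.tmpdir(), 'truncate_demo.txt');
fs.writeFileSync(logPath, 'line one\nline two\n');
const sizeBefore = fs.statSync(logPath).size;

fs.truncateSync(logPath, 8);   // keep only the first 8 bytes
const sizeAfter = fs.statSync(logPath).size;
const remaining = fs.readFileSync(logPath, 'utf8');

console.log(sizeBefore + ' bytes -> ' + sizeAfter + ' bytes: "' + remaining + '"');
fs.unlinkSync(logPath);        // clean up the scratch file
```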
The following code snippet illustrates the process of truncating a file named
log.txt to zero bytes.
fs.truncate("log.txt", 0, function(err){
  console.log(err ? "File Truncate Failed" : "File Truncated");
});
Making and Removing Directories
At times you may need to implement a directory structure for files being stored by
your Node.js application. The fs module provides the functionality to add and
remove directories as necessary.
To add a directory from Node.js, use one of the following fs calls. The path can be
absolute or relative. The optional mode parameter allows you to specify the access
mode for the new directory.
fs.mkdir(path, [mode], callback)
fs.mkdirSync(path, [mode])
The mkdirSync(path) call throws an exception if the directory cannot be created
rather than returning a success flag. The asynchronous mkdir() call passes an error
value to the callback function if an error is encountered when creating the directory.
Keep in mind that when using the asynchronous method, you need to wait for the
callback for the creation of the directory before creating a subdirectory. The
following code snippet shows how to chain the creation of a subdirectory structure
together:
fs.mkdir("./data/folderA", function(err){
  fs.mkdir("./data/folderA/folderB", function(err){
    fs.mkdir("./data/folderA/folderB/folderD", function(err){
    });
  });
  fs.mkdir("./data/folderA/folderC", function(err){
    fs.mkdir("./data/folderA/folderC/folderE", function(err){
    });
  });
});
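Newer Node.js releases also accept an options object with a recursive property, which creates any missing parent directories in a single call. The sketch below assumes a hypothetical createTree() helper and a scratch location under the OS temporary directory; it builds the same two branches with one call each:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: builds both branches of the folder tree below root
function createTree(root, callback) {
  const branchD = path.join(root, 'folderA', 'folderB', 'folderD');
  const branchE = path.join(root, 'folderA', 'folderC', 'folderE');
  // recursive: true creates any missing parent directories as well
  fs.mkdir(branchD, { recursive: true }, function (err) {
    if (err) { return callback(err); }
    fs.mkdir(branchE, { recursive: true }, function (err) {
      callback(err);
    });
  });
}

// Scratch location under the OS temp directory (an assumption for this sketch)
const scratch = fs.mkdtempSync(path.join(os.tmpdir(), 'mkdir-demo-'));
createTree(scratch, function (err) {
  console.log(err ? "Folder Create Failed" : "Folders Created");
});
```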
To delete a directory from Node.js, use one of the following fs calls. The path can
be absolute or relative.
fs.rmdir(path, callback)
fs.rmdirSync(path)
The rmdirSync(path) call throws an exception if the directory cannot be deleted
rather than returning a success flag. The asynchronous rmdir() call passes an error
value to the callback function if an error is encountered when deleting the directory.
Just as with the mkdir() calls, keep in mind that when using the asynchronous
method, you need to wait for the callback of the deletion of the directory before
deleting the parent directory. The following code snippet shows how to chain the
deletion of a subdirectory structure together:
fs.rmdir("./data/folderA/folderB/folderD", function(err){
  fs.rmdir("./data/folderA/folderB", function(err){
    fs.rmdir("./data/folderA/folderC/folderE", function(err){
      fs.rmdir("./data/folderA/folderC", function(err){
        fs.rmdir("./data/folderA", function(err){
        });
      });
    });
  });
});
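The reason for working from the leaves upward is that rmdir only removes empty directories. The sketch below demonstrates that on a scratch tree (the temporary location is an assumption), and then uses fs.rmSync() with the recursive option, which newer Node.js releases provide for removing a directory together with its contents:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Scratch tree under the OS temp directory (an assumption for this sketch)
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'rmdir-demo-'));
const folderA = path.join(root, 'folderA');
fs.mkdirSync(path.join(folderA, 'folderB', 'folderD'), { recursive: true });

// rmdir only removes empty directories, so this attempt fails
let rmdirFailed = false;
try {
  fs.rmdirSync(folderA);
} catch (err) {
  rmdirFailed = true;
  console.log("rmdir refused: " + err.code);
}

// fs.rmSync with recursive: true removes the directory and its contents
fs.rmSync(root, { recursive: true, force: true });
console.log(fs.existsSync(root) ? "Still There" : "Tree Removed");
```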
Renaming Files and Directories
You might also need to rename files and folders in your Node.js application to make
room for new data, archive old data, or apply changes made by a user. Renaming
files and folders uses the fs calls shown here:
fs.rename(oldPath, newPath, callback)
fs.renameSync(oldPath, newPath)
The oldPath specifies the existing file or directory path, and the newPath
specifies the new name. The renameSync(oldPath, newPath) call returns undefined
on success and throws an exception if the rename fails. The asynchronous
rename() call passes an error value to the callback function if an error is
encountered when renaming the file or directory.
The following code snippet illustrates implementing fs calls to rename a file named
old.txt to new.txt and a directory named testDir to renamedDir:
fs.rename("old.txt", "new.txt", function(err){
  console.log(err ? "Rename Failed" : "File Renamed");
});
fs.rename("testDir", "renamedDir", function(err){
  console.log(err ? "Rename Failed" : "Folder Renamed");
});
Watching for File Changes
Although not entirely stable, the fs module provides a useful tool to watch a file and
execute a callback function when the file changes. This can be useful if you want to
trigger events to occur when a file is modified, but do not want to continually poll
from your application directly. This does incur some overhead in the underlying OS,
so you should use watches sparingly.
To implement a watch on a file, use the following command passing the path to the
file you want to watch. You can also pass in options, which is an object that
contains persistent and interval properties. The persistent property is
true if you want the process to continue to run as long as files are being watched.
The interval property specifies the time in milliseconds that you want the file to
be polled for changes:
fs.watchFile(path, [options], callback)
When a file change occurs, the callback function is executed and passes a current
and previous Stats object.
The following code example monitors a file named log.txt at an interval of every
5 seconds and uses the Stats object to output the current and previous times the
file was modified:
fs.watchFile("log.txt", {persistent:true, interval:5000}, function (curr, prev) {
  console.log("log.txt modified at: " + curr.mtime);
  console.log("Previous modification was: " + prev.mtime);
});
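A persistent watch keeps the process alive, so you usually also need a way to stop it. The following sketch pairs fs.watchFile() with fs.unwatchFile(); the describeChange() and startWatching() helpers are assumptions made for illustration:

```javascript
const fs = require('fs');

// Hypothetical helper: summarizes the two Stats objects passed by watchFile()
function describeChange(curr, prev) {
  if (curr.mtimeMs === 0) {
    return "file is missing";   // fields are zeroed when the file does not exist
  }
  if (curr.mtimeMs !== prev.mtimeMs) {
    return "modified at " + curr.mtime.toISOString();
  }
  return "accessed but not modified";
}

// Starts a watch and returns a function that removes it again
function startWatching(path, intervalMs) {
  function listener(curr, prev) {
    console.log(path + ": " + describeChange(curr, prev));
  }
  fs.watchFile(path, { persistent: true, interval: intervalMs }, listener);
  return function stop() {
    fs.unwatchFile(path, listener);
  };
}

const stop = startWatching("log.txt", 5000);
stop();   // without this call the watch keeps the process running
```

Comparing mtimeMs values matters because the callback can also fire when a file is merely accessed.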
Summary
Node.js provides the fs module that allows you to interact with the file system. The
fs module allows you to create, read, and modify files. You can also use the fs
module to navigate the directory structure, look at information about files and
folders, and change the directory structure by deleting and renaming files and
folders.
Next
The next chapter focuses on using the http module to implement basic webservers.
You see how to parse query strings and also how to implement a basic webserver in
Node.js.
7
Implementing HTTP Services in
Node.js
One of the most important aspects of Node.js is the ability to quickly implement
HTTP and HTTPS servers and services. Node.js provides the http and https
modules out of the box, and they provide the basic framework to do most everything
you need from an HTTP and HTTPS standpoint. In fact, it is not difficult to
implement a full webserver using just the http module.
That said, you will likely use a different module, such as express, to implement a
full-on webserver. This is because the http module is pretty low level. It doesn’t
provide calls to handle routing, cookies, caching, and so on. When you get to the
Express chapters later in this book, you will see the advantages it provides.
What you will more likely be using the http module for is implementing backend
web services for your applications to use. That is where the http module becomes
an invaluable tool in your arsenal. You can create basic HTTP servers that provide
an interface for communications behind your firewall and then basic HTTP clients
that interact with those services.
Therefore, this chapter focuses on understanding the objects that come into play
when implementing clients and servers using the http module. The examples in
this chapter are basic so that they are easy to consume and expand on.
Processing URLs
The Uniform Resource Locator (URL) acts as an address label for the HTTP server
to handle requests from the client. It provides all the information needed to get the
request to the correct server on a specific port and access the proper data.
The URL can be broken down into several different components, each providing a
basic piece of information for the webserver on how to route and handle the HTTP
request from the client. Figure 7.1 illustrates the basic structure of a URL and the
components that may be included. Not all these components are included in every
HTTP request. For example, most requests do not include the auth component, and
many do not include a query string or hash location.
Figure 7.1 Basic components that can be included in a URL
Understanding the URL Object
HTTP requests from the client include the URL string with the information shown in
Figure 7.1. To use the URL information more effectively, Node.js provides the url
module that provides functionality to convert the URL string into a URL object.
To create a URL object from the URL string, pass the URL string as the first
parameter to the following method:
url.parse(urlStr, [parseQueryString], [slashesDenoteHost])
The url.parse() method takes the URL string as the first parameter. The
parseQueryString parameter is a Boolean that when true also parses the
query string portion of the URL into an object literal. The default is false. The
slashesDenoteHost is also a Boolean that when true parses a URL with the
format of //host/path to {host: 'host', pathname: '/path'}
instead of {pathname: '//host/path'}. The default is false.
You can also convert a URL object into a string form using the following
url.format() method:
url.format(urlObj)
Table 7.1 lists the attributes of the URL objects created by url.parse().
The following shows an example of parsing a URL string into an object and then
converting it back into a string:
var url = require('url');
var urlStr = 'http://user:pass@host.com:80/resource/path?query=string#hash';
var urlObj = url.parse(urlStr, true, false);
var urlString = url.format(urlObj);
Table 7.1 Properties of the URL object
Property   Description
href       This is the full URL string that was originally parsed.
protocol   The request protocol lowercased.
host       The full host portion of the URL including port information
           lowercased.
auth       The authentication information portion of a URL.
hostname   The hostname portion of the host lowercased.
port       The port number portion of the host.
pathname   The path portion of the URL including the initial slash if present.
search     The query string portion of the URL including the leading question
           mark.
path       The full path including the pathname and search.
query      This is either the parameter portion of the query string or a parsed
           object containing the query string parameters and values if the
           parseQueryString is set to true.
hash       The hash portion of the URL including the pound sign (#).
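Newer Node.js releases also expose the WHATWG URL class as a global, and the Node.js documentation marks url.parse() as a legacy API. The sketch below shows the closest equivalents of the properties in Table 7.1. Port 8080 is an assumption chosen for this example, because the URL class leaves out a port that matches the protocol default; note also that username and password replace auth, and searchParams replaces query:

```javascript
const urlStr = 'http://user:pass@host.com:8080/resource/path?query=string#hash';
const parsed = new URL(urlStr);

console.log(parsed.protocol);                  // 'http:'
console.log(parsed.username, parsed.password); // 'user' 'pass'
console.log(parsed.host);                      // 'host.com:8080'
console.log(parsed.hostname);                  // 'host.com'
console.log(parsed.port);                      // '8080'
console.log(parsed.pathname);                  // '/resource/path'
console.log(parsed.search);                    // '?query=string'
console.log(parsed.searchParams.get('query')); // 'string'
console.log(parsed.hash);                      // '#hash'

// href serializes the object back into a string, much like url.format()
console.log(parsed.href === urlStr);           // true
```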
Resolving the URL Components
Another useful feature of the url module is the ability to resolve URL components
in the same manner as a browser would. This allows you to manipulate the URL
strings on the server side to make adjustments in the URL. For example, you might
want to change the URL location before processing the request because a resource
has moved or changed parameters.
To resolve a URL to a new location use the following syntax:
url.resolve(from, to)
The from parameter specifies the original base URL string. The to parameter
specifies the new location where you want the URL to resolve. The following code
illustrates an example of resolving a URL to a new location.
var url = require('url');
var originalUrl = 'http://user:pass@host.com:80/resource/path?query=string#hash';
var newResource = '/another/path?querynew';
console.log(url.resolve(originalUrl, newResource));
The output of the previous code snippet is shown below. Notice that only the
resource path and beyond are altered in the resolved URL location:
http://user:pass@host.com:80/another/path?querynew
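The WHATWG URL class can perform the same kind of resolution by passing the base URL as the second constructor argument. This is a brief sketch rather than a drop-in replacement for url.resolve(); port 8080 is an assumption chosen because the URL class omits default ports from its output:

```javascript
const originalUrl = 'http://user:pass@host.com:8080/resource/path?query=string#hash';

// An absolute path replaces everything after the host and port
const moved = new URL('/another/path?querynew', originalUrl).href;
// A relative name replaces only the last path segment
const sibling = new URL('other', originalUrl).href;

console.log(moved);    // http://user:pass@host.com:8080/another/path?querynew
console.log(sibling);  // http://user:pass@host.com:8080/resource/other
```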
Processing Query Strings and Form Parameters
HTTP requests often include query strings in the URL or parameter data in the body
for form submissions. The query string can be obtained from the URL object defined
in the previous section. The parameter data sent by a form request can be read out of
the body of the client request, as described later in this chapter.
The query string and form parameters are just basic key-value pairs. To actually
consume these values in your Node.js webserver you need to convert the string into a
JavaScript object using the parse() method from the querystring module:
querystring.parse(str, [sep], [eq], [options])
The str parameter is the query or parameter string. The sep parameter allows you
to specify the separator character used. The default separator character is &. The eq
parameter allows you to specify the assignment character to use when parsing. The
default is =. The options parameter is an object with the property maxKeys that
allows you to limit the number of keys the resulting object can contain. The default
is 1000. If you specify 0, there is no limit.
The following shows an example of using parse() to parse a query string:
var qstring = require('querystring');
var params = qstring.parse("name=Brad&color=red&color=blue");
The params object created would be:
{name: 'Brad', color: ['red', 'blue']}
You can also go back the other direction and convert an object to a query string
using the stringify() function shown here:
querystring.stringify(obj, [sep], [eq])
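The optional parameters are easier to follow with a short example. The sketch below round-trips the object from the previous example and then exercises the sep, eq, and maxKeys arguments; the input strings are made up for illustration:

```javascript
const qstring = require('querystring');

const params = qstring.parse("name=Brad&color=red&color=blue");
// stringify() reverses parse(), repeating the key for array values
const rebuilt = qstring.stringify(params);

// Custom separator (;) and assignment (:) characters
const custom = qstring.parse("name:Brad;color:red", ";", ":");

// Keep only the first two keys; null leaves sep and eq at their defaults
const limited = qstring.parse("a=1&b=2&c=3", null, null, { maxKeys: 2 });

console.log(rebuilt);               // name=Brad&color=red&color=blue
console.log(custom.name, custom.color);
console.log(Object.keys(limited));
```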
Understanding Request, Response, and Server
Objects
To use the http module in Node.js applications, you first need to understand the
request and response objects. They provide the information and much of the
functionality that comes into and out of the HTTP clients and servers. Once you see
the makeup of these objects, including the properties, events, and methods they
provide, it will be simple to implement your own HTTP servers and clients.
The following sections cover the purpose and behavior of the ClientRequest,
ServerResponse, IncomingMessage, and Server objects. The most
important events, properties, and methods that each provides also are covered.
The http.ClientRequest Object
The ClientRequest object is created internally when you call
http.request() when building the HTTP client. This object represents the
request while it is in progress to the server. You use the ClientRequest object to
initiate, monitor, and handle the response from the server.
The ClientRequest implements a Writable stream, so it provides all the
functionality of a Writable stream object. For example, you can use the
write() method to write to it as well as pipe a Readable stream into it.
To implement a ClientRequest object, you use a call to http.request()
using the following syntax:
http.request(options, callback)
The options parameter is an object whose properties define how to open and send
the client HTTP request to the server. Table 7.2 lists the properties that you can
specify. The callback parameter is a callback function that is called after the
request is sent to the server and handles the response back from the server. The only
parameter to the callback is an IncomingMessage object that will be the response
from the server.
The following code shows the basics of implementing the ClientRequest object:
var http = require('http');
var options = {
  hostname: 'www.myserver.com',
  path: '/',
  port: '8080',
  method: 'POST'
};
var req = http.request(options, function(response){
  var str = '';
  // Collect the response body as it streams in
  response.on('data', function (chunk) {
    str += chunk;
  });
  response.on('end', function () {
    console.log(str);
  });
});
req.end();
Table 7.2 Options that can be specified when creating a ClientRequest
Property       Description
host           The domain name or IP address of the server to issue the
               request to. Defaults to localhost.
hostname       Same as host but preferred over host to support url.parse().
port           Port of remote server. Defaults to 80.
localAddress   Local interface to bind for network connections.
socketPath     Unix Domain Socket (use one of host:port or socketPath).
method         A string specifying the HTTP request method. For example, GET,
               POST, CONNECT, OPTIONS, etc. Defaults to GET.
path           A string specifying the requested resource path. Defaults to /.
               This should also include the query string if any. For example:
               /book.html?chapter=12
headers        An object containing request headers. For example:
               { 'content-length': '750', 'content-type': 'text/plain' }
auth           Basic authentication in the form of user:password used to
               compute an Authorization header.
agent          Defines the Agent behavior. When an Agent is used, request
               defaults to Connection:keep-alive. Possible values are:
               undefined (default): Uses global Agent.
               Agent object: Uses specific Agent object.
               false: Disables Agent behavior.
The ClientRequest object provides several events that enable you to handle the
various states the request may experience. For example, you can add a listener that is
called when the response event is triggered by the server’s response. Table 7.3
lists the events available on ClientRequest objects.
Table 7.3 Events available on ClientRequest objects
Event      Description
response   Emitted when a response to this request is received from the server.
           The callback handler receives back an IncomingMessage object
           as the only parameter.
socket     Emitted after a socket is assigned to this request.
connect    Emitted every time a server responds to a request that was initiated
           with a CONNECT method. If this event is not handled by the client,
           then the connection will be closed.
upgrade    Emitted when the server responds to a request that includes an
           Upgrade request in the headers.
continue   Emitted when the server sends a 100 Continue HTTP response
           instructing the client to send the request body.
In addition to events, the ClientRequest object also provides several methods
that can be used to write data to the request, abort the request, or end the request.
Table 7.4 lists the methods available on the ClientRequest object.
Table 7.4 Methods available on ClientRequest objects
Method                  Description
write(chunk,            Writes a chunk, Buffer or String object, of body
[encoding])             data into the request. This allows you to stream data
                        into the Writable stream of the ClientRequest
                        object. If you stream the body data, you should
                        include the {'Transfer-Encoding': 'chunked'} header
                        option when you create the request. The encoding
                        parameter defaults to utf8.
end([data],             Writes the optional data out to the request body and
[encoding])             then flushes the Writable stream and terminates
                        the request.
abort()                 Aborts the current request.
setTimeout(timeout,     Sets the socket timeout for the request.
[callback])
setNoDelay              Disables the Nagle algorithm, which buffers data
([noDelay])             before sending it. The noDelay argument is a
                        Boolean that is true for immediate writes and
                        false for buffered writes.
setSocketKeepAlive      Enables and disables the keep-alive functionality
([enable],              on the client request. The enable parameter
[initialDelay])         defaults to false, which is disabled. The
                        initialDelay parameter specifies the delay
                        between the last data packet and the first
                        keep-alive request.
The http.ServerResponse Object
The ServerResponse object is created by the HTTP server internally when a
request event is received. It is passed to the request event handler as the
second argument. You use the ServerResponse object to formulate and send a
response to the client.
The ServerResponse implements a Writable stream, so it provides all the
functionality of a Writable stream object. For example, you can use the
write() method to write to it as well as pipe a Readable stream into it to write
data back to the client.
When handling the client request, you use the properties, events, and methods of the
ServerResponse object to build and send headers, write data, and send the
response. Table 7.5 lists the event and properties available on the
ServerResponse object. Table 7.6 lists the methods available on the
ServerResponse object.
Table 7.5 Event and properties available on ServerResponse objects
Event/Property   Description
close            Emitted when the connection to the client is closed prior to
                 sending the response.end() to finish and flush the response.
headersSent      A Boolean that is true if headers have been sent; otherwise,
                 false. This is read only.
sendDate         A Boolean that, when set to true, causes the Date header to be
                 automatically generated and sent as part of the response.
statusCode       Allows you to specify the response status code without having
                 to explicitly write the headers. For example:
                 response.statusCode = 500;
Table 7.6 Methods available on ServerResponse objects
Method                  Description
writeContinue()         Sends an HTTP/1.1 100 Continue message
                        to the client requesting that the body data be sent.
writeHead(statusCode,   Writes a response header to the request. The
[reasonPhrase],         statusCode parameter is the three-digit HTTP
[headers])              response status code, for example, 200, 401,
                        500. The optional reasonPhrase is a string
                        denoting the reason for the statusCode. The
                        headers are the response headers object, for
                        example:
                        response.writeHead(200, 'Success', {
                          'Content-Length': body.length,
                          'Content-Type': 'text/plain' });
setTimeout(msecs,       Sets the socket timeout for the client connection in
callback)               milliseconds along with a callback function to
                        be executed if the timeout occurs.
setHeader(name,         Sets the value of a specific header where name is
value)                  the HTTP header name and value is the header
                        value.
getHeader(name)         Gets the value of an HTTP header that has been
                        set in the response.
removeHeader(name)      Removes an HTTP header that has been set in the
                        response.
write(chunk,            Writes a chunk, Buffer or String object, of
[encoding])             data out to the response Writable stream. This
                        only writes data to the body portion of the
                        response. The default encoding is utf8. This
                        returns true if the data is written successfully or
                        false if the data is written to user memory. If it
                        returns false, then a drain event is emitted by
                        the Writable stream when the buffer is free
                        again.
addTrailers(headers)    Adds HTTP trailing headers to the end of the
                        response.
end([data],             Writes the optional data out to the response body
[encoding])             and then flushes the Writable stream and
                        finalizes the response.
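The properties and methods above are typically combined in a small helper. The following sketch assumes a hypothetical sendJson() function (not part of the http module) that sets the status code and headers and then ends the response in one step:

```javascript
// Hypothetical helper: sends obj as a JSON response with the given status
function sendJson(response, statusCode, obj) {
  const body = JSON.stringify(obj);
  response.statusCode = statusCode;
  response.setHeader('Content-Type', 'application/json');
  // Content-Length is in bytes, which can differ from the string length
  response.setHeader('Content-Length', Buffer.byteLength(body));
  response.end(body);
}

// Inside a request handler you might call:
//   sendJson(res, 404, { error: 'Not Found' });
```

Using Buffer.byteLength() rather than body.length avoids sending a Content-Length that is too small when the body contains multibyte characters.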
The http.IncomingMessage Object
The IncomingMessage object is created either by the HTTP server or the HTTP
client. On the server side, the client request is represented by an
IncomingMessage object, and on the client side the server response is
represented by an IncomingMessage object. The IncomingMessage object
can be used for both because the functionality is basically the same.
The IncomingMessage implements a Readable stream, allowing you to read
the client request or server response as a streaming source. This means that the
readable and data events can be listened to and used to read data from the
stream.
In addition to the functionality provided by the Readable class, the
IncomingMessage object also provides the properties, events, and methods listed
in Table 7.7. These allow you to access information from the client request or server
response.
Table 7.7 Events, properties, and methods available on IncomingMessage
objects
Method/Event/Property   Description
close                   Emitted when the underlying socket is closed.
httpVersion             Specifies the version of HTTP used to build the client
                        request/response.
headers                 This is an object containing the headers sent with the
                        request/response.
trailers                This is an object containing any trailer headers sent
                        with the request/response.
method                  Specifies the method for the request/response. For
                        example: GET, POST, CONNECT.
url                     The URL string sent to the server. This is the string
                        that can be passed to url.parse(). This attribute is
                        only valid in the HTTP server handling the client
                        request.
statusCode              Specifies the three-digit status code from the server.
                        This attribute is only valid on the HTTP client when
                        handling a server response.
socket                  This is a handle to the net.Socket object used to
                        communicate with the client/server.
setTimeout(msecs,       Sets the socket timeout for the connection in
callback)               milliseconds along with a callback function to be
                        executed if the timeout occurs.
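Because an IncomingMessage is a Readable stream, reading a request or response body follows the usual data and end event pattern. The sketch below assumes a hypothetical readBody() helper and uses a PassThrough stream as a stand-in for a real IncomingMessage so that it runs without a network connection:

```javascript
const stream = require('stream');

// Hypothetical helper: collects a Readable stream into one string
function readBody(message, callback) {
  let body = '';
  message.setEncoding('utf8');   // deliver strings instead of Buffers
  message.on('data', function (chunk) { body += chunk; });
  message.on('end', function () { callback(null, body); });
  message.on('error', function (err) { callback(err); });
}

// A PassThrough stream stands in for an IncomingMessage in this sketch
const fake = new stream.PassThrough();
readBody(fake, function (err, body) {
  console.log(err ? "Read Failed" : "Body: " + body);
});
fake.write('name=Brad&');
fake.end('color=red');
```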
The http.Server Object
The Node.js HTTP Server object provides the fundamental framework to
implement HTTP servers. It provides an underlying socket that listens on a port and
handles receiving requests and then sends responses out to client connections. While
the server is listening, the Node.js application will not end.
The Server object implements EventEmitter and emits the events listed in
Table 7.8. As you implement an HTTP server, you need to handle at least some
of these events. For example, at a minimum you need an event handler to handle the
request event that is triggered when a client request is received.
Table 7.8 Events that can be triggered by Server objects
Event           Description
request         Triggered each time the server receives a client request. The
                callback should accept two parameters. The first is an
                IncomingMessage object representing the client request,
                and the second is a ServerResponse object you use to
                formulate and send the response. For example:
                function callback (request, response){}
connection      Triggered when a new TCP stream is established. The
                callback receives the socket as the only parameter. For
                example:
                function callback (socket){}
close           Triggered when the server is closed. The callback receives no
                parameters.
checkContinue   Triggered when a request that includes the Expect:
                100-continue header is received. There is a default event
                handler that responds with an HTTP/1.1 100 Continue
                even if you do not handle this event. For example:
                function callback (request, response){}
connect         Emitted when an HTTP CONNECT request is received. The
                callback receives the request, socket, and head, which is a
                Buffer containing the first packet of the tunneling stream.
                For example:
                function callback (request, socket, head){}
upgrade         Emitted when the client requests an HTTP upgrade. If this
                event is not handled, clients sending an upgrade request will
                have their connections closed. The callback receives the
                request, socket, and head, which is a Buffer containing the
                first packet of the tunneling stream. For example:
                function callback (request, socket, head){}
clientError     Emitted when the client connection socket emits an error. The
                callback receives an error as the first parameter and the
                socket as the second. For example:
                function callback (error, socket){}
To start the HTTP server, you need to first create a Server object using the
createServer() method shown below. This method returns the Server object.
The optional requestListener parameter is a callback that is executed when the
request event is triggered. The callback should accept two parameters. The first is an
IncomingMessage object representing the client request, and the second is a
ServerResponse object you use to formulate and send the response:
http.createServer([requestListener])
Once you have created the Server object, you can begin listening on it by calling
the listen() method on the Server object:
listen(port, [hostname], [backlog], [callback])
The first method listen(port, [hostname], [backlog],
[callback]) is the one that you will most likely use. The following list describes
each of the parameters:
port: Specifies the port to listen on.
hostname: Specifies the hostname on which the server accepts connections. If
omitted, the server will accept connections directed to any IPv4 address
(INADDR_ANY).
backlog: Specifies the maximum number of pending connections that are
allowed to be queued. This defaults to 511.
callback: Specifies the callback handler to execute once the server has begun
listening on the specified port.
The following code shows an example of starting an HTTP server and listening on
port 8080. Notice the request callback handler:
var http = require('http');
http.createServer(function (req, res) {
  // <your code to handle the request and send the response goes here>
}).listen(8080);
Two other methods can be used to listen for connections through the file system. The
first accepts a path to a file to listen on, and the second accepts an already open file
descriptor handle:
listen(path, [callback])
listen(handle, [callback])
To stop the HTTP server from listening once it has started, use the following
close() method:
close([callback])
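The listen() and close() calls described above can be combined into a short life-cycle sketch. This is only an illustration, not one of the book's listings: the lifecycle array is a hypothetical name, and port 0 is an assumption made here so that the operating system picks any free port.

```javascript
var http = require('http');

// Records what happened, in order (hypothetical helper for this sketch).
var lifecycle = [];

// Create a server whose request handler returns a short plain-text body.
var server = http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('ok\n');
});

// Port 0 asks the operating system for any free port (an assumption made
// here so the sketch does not collide with other services).
server.listen(0, '127.0.0.1', function () {
  lifecycle.push('listening');
  console.log('Listening on port ' + server.address().port);
  // Stop accepting new connections; the callback runs once the server
  // has fully closed.
  server.close(function () {
    lifecycle.push('closed');
    console.log('Server closed');
  });
});
```

In a real service you would call close() in response to a shutdown signal rather than immediately after the server starts listening.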
Implementing HTTP Clients and Servers in Node.js
Now that you understand the ClientRequest, ServerResponse, and
IncomingMessage objects, you are ready to implement some Node.js HTTP
clients and servers. This section guides you through the process of implementing
basic HTTP clients and servers in Node.js. To do this, a client and server are
implemented in each section to show you how the two interact.
The examples in the following sections are basic to make it easy for you to grasp the
concepts of starting the client/server and then handling the different requests and
responses. There is no error handling, protection against attacks, or much of the other
functionality built in. However, the examples provide a good variety of the basic
flow and structure required to handle general HTTP requests using the http
module.
Serving Static Files
The most basic type of HTTP server is one that serves static files. To serve static
files from Node.js, you need to first start the HTTP server and listen on a port. Then
in the request handler, you open the file locally using the fs module and write the
file contents to the response.
Listing 7.1 shows the basic implementation of a static file server. Notice that line 5
creates the server using createServer() and also defines the request event
handler shown in lines 6–15. Also notice that the server is listening on port 8080 by
calling listen() on the Server object.
Inside the request event handler on line 6, the url.parse() method is used to
parse the url so that we can use the pathname attribute when specifying the path
for the file in line 7. The static file is opened and read using fs.readFile(), and
in the readFile() callback the contents of the file are written to the response
object using res.end(data) on line 14.
Listing 7.1 http_server_static.js: Implementing a basic static file
webserver
01 var fs = require('fs');
02 var http = require('http');
03 var url = require('url');
04 var ROOT_DIR = "html/";
05 http.createServer(function (req, res) {
06   var urlObj = url.parse(req.url, true, false);
07   fs.readFile(ROOT_DIR + urlObj.pathname, function (err,data) {
08     if (err) {
09       res.writeHead(404);
10       res.end(JSON.stringify(err));
11       return;
12     }
13     res.writeHead(200);
14     res.end(data);
15   });
16 }).listen(8080);
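As noted earlier, the static file example has no protection against attacks. One gap is that a request path containing ".." segments can point outside the root directory. The following sketch shows one possible way to guard against that with the path module. The function name resolveStaticPath is hypothetical and is not part of the original listing.

```javascript
var path = require('path');

// Returns an absolute file path inside rootDir, or null when the requested
// pathname is malformed or would escape rootDir.
function resolveStaticPath(rootDir, pathname) {
  var root = path.resolve(rootDir);
  var decoded;
  try {
    // Percent-encoded sequences such as %2e%2e must be decoded before checking.
    decoded = decodeURIComponent(pathname);
  } catch (e) {
    return null; // malformed percent-encoding
  }
  if (decoded.indexOf('\0') !== -1) {
    return null; // null bytes are never valid in a file path
  }
  // Prefix with '.' so the pathname is treated as relative to the root.
  var target = path.resolve(root, '.' + decoded);
  if (target !== root && target.indexOf(root + path.sep) !== 0) {
    return null; // resolved outside the root directory
  }
  return target;
}

console.log(resolveStaticPath('html', '/hello.html'));
console.log(resolveStaticPath('html', '/../secret.txt'));
```

In the request handler you could pass urlObj.pathname to this function and respond with a 404 when it returns null, instead of concatenating the pathname onto ROOT_DIR directly.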
Listing 7.2 shows a basic implementation of an HTTP client that sends a GET request
to the server to retrieve the file contents. Notice that the options for the request are
set in lines 2–6, and then the client request is initiated in lines 16–18 passing the
options.
When the request completes, the callback function uses the on('data') handler
to read the contents of the response from the server and then the on('end')
handler to log the file contents to the console. Figure 7.2 and Listing 7.2 Output show the
output of the HTTP client as well as accessing the static file from a web browser.
Listing 7.2 http_client_static.js: Basic web client retrieving static files
01 var http = require('http');
02 var options = {
03   hostname: 'localhost',
04   port: '8080',
05   path: '/hello.html'
06 };
07 function handleResponse(response) {
08   var serverData = '';
09   response.on('data', function (chunk) {
10     serverData += chunk;
11   });
12   response.on('end', function () {
13     console.log(serverData);
14   });
15 }
16 http.request(options, function(response){
17   handleResponse(response);
18 }).end();
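The client above does not handle failures such as the server not running. The sketch below shows one way to add that, using http.get(), which sends a GET request and calls end() automatically. The function name fetchPage and its callback signature are assumptions made for this illustration.

```javascript
var http = require('http');

// Fetches a page and calls callback(err, statusCode, body).
function fetchPage(options, callback) {
  var req = http.get(options, function (response) {
    var body = '';
    response.setEncoding('utf8'); // deliver chunks as strings
    response.on('data', function (chunk) {
      body += chunk;
    });
    response.on('end', function () {
      callback(null, response.statusCode, body);
    });
  });
  // Connection problems (for example, a refused connection) are reported
  // through the error event on the request object.
  req.on('error', function (err) {
    callback(err);
  });
}
```

A call such as fetchPage({hostname: 'localhost', port: 8080, path: '/hello.html'}, callback) would then receive either an error or the status code and file contents.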
Listing 7.2 Output Implementing a basic static file webserver
C:\books\node\ch07>node http_server_static.js
Static Example
Hello from a Static File
Figure 7.2 Implementing a basic static file web server
Implementing Dynamic GET Servers
More often than not you will use Node.js webservers to serve dynamic content rather
than static content. This content may be dynamic HTML files or snippets, JSON
data, or a number of other data types. To serve a GET request dynamically, you need
to implement code in the request handler that dynamically populates the data you
want to send back to the client, writes it out to the response, and then calls end() to
finalize the response and flush the Writable stream.
Listing 7.3 shows the basic implementation of a dynamic web service. In this case,
the web service simply responds with a dynamically generated HTML file. The
example is designed to show the process of sending the headers, building the
response, and then sending the data in a series of write() requests.
Notice that line 6 creates the server using createServer(), and line 15 begins
listening on port 8080 using listen(). Inside the request event handler defined
in lines 7–15, the Content-Type header is set and then the headers are sent with a
response code of 200. In reality you would have already done a lot of processing to
prepare the data. But in this case, the data is just the messages array defined in
lines 2–5.
Notice that in lines 11–13 the loop iterates through the messages and calls write()
each time to stream the response to the client. Then in line 14 the response is
completed by calling end().
Listing 7.3 http_server_get.js: Implementing a basic GET webserver
01 var http = require('http');
02 var messages = [
03   'Hello World',
04   'From a basic Node.js server',
05   'Take Luck'];
06 http.createServer(function (req, res) {
07   res.setHeader("Content-Type", "text/html");
08   res.writeHead(200);
09   res.write('<html><head><title>Simple HTTP Server</title></head>');
10   res.write('<body>');
11   for (var idx in messages){
12     res.write('\n<h1>' + messages[idx] + '</h1>');
13   }
14   res.end('\n</body></html>');
15 }).listen(8080);
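The same handler pattern can return JSON data instead of HTML. The following is a minimal sketch and not one of the book's listings: jsonHandler is a hypothetical name, and the shape of the JSON object is an assumption chosen for illustration.

```javascript
var messages = [
  'Hello World',
  'From a basic Node.js server'];

// Request handler that builds the JSON body first so that the
// Content-Length header can be set before the response is sent.
function jsonHandler(req, res) {
  var body = JSON.stringify({count: messages.length, messages: messages});
  res.writeHead(200, {
    'Content-Type': 'application/json',
    'Content-Length': Buffer.byteLength(body)
  });
  res.end(body);
}

// To serve it, pass the handler to createServer(), for example:
// require('http').createServer(jsonHandler).listen(8080);
```

Because the whole body is known up front, a single end(body) call replaces the series of write() calls used when streaming the HTML response.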
Listing 7.4 shows a basic implementation of an HTTP client that reads the response
from the server in Listing 7.3. This is similar to the example in Listing 7.2; however,
note that no path was specified since the service doesn’t really require one. For more
complex services, you would implement query strings or complex path routes to
handle a variety of calls.
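As a rough illustration of handling query strings and path routes, the sketch below uses url.parse() to pick a response based on the pathname and a query parameter. The route() function, the /greet path, and the name parameter are all hypothetical and chosen only for this example.

```javascript
var url = require('url');

// Decides on a status code and body for a request URL.
function route(reqUrl) {
  // Passing true as the second argument parses the query string into an object.
  var urlObj = url.parse(reqUrl, true);
  if (urlObj.pathname === '/greet') {
    var name = urlObj.query.name || 'World';
    return {status: 200, body: 'Hello ' + name};
  }
  return {status: 404, body: 'Not found: ' + urlObj.pathname};
}

console.log(route('/greet?name=Bilbo'));
console.log(route('/missing'));
```

Inside a request handler you would call route(req.url), pass the status to writeHead(), and pass the body to end().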
Note that on line 12 the statusCode from the response is logged to the console.
On line 13 the headers from the response are logged, and on line 14 the
full response from the server is logged. Figure 7.3 and Listing 7.4 Output show the
output of the HTTP client as well as accessing the dynamic GET server from a web
browser.
Listing 7.4 http_client_get.js: Basic web client that makes a GET request
to the server in Listing 7.3
01 var http = require('http');
02 var options = {
03   hostname: 'localhost',
04   port: '8080',
05 };
06 function handleResponse(response) {
07   var serverData = '';
08   response.on('data', function (chunk) {
09     serverData += chunk;
10   });
11   response.on('end', function() {
12     console.log("Response Status:", response.statusCode);
13     console.log("Response Headers:", response.headers);
14     console.log(serverData);
15   });
16 }
17 http.request(options, function(response){
18   handleResponse(response);
19 }).end();
Listing 7.4 Output Implementing a basic HTTP GET service
C:\books\node\ch07>node http_server_get.js
Response Status: 200
Response Headers: { 'content-type': 'text/html',
date: 'Mon, 26 Sep 2016 17:10:33 GMT',
connection: 'close',
'transfer-encoding': 'chunked' }
Simple HTTP Server
Hello World
From a basic Node.js server
Take Luck
Figure 7.3 Output of a basic HTTP GET server
Implementing POST Servers
Implementing a POST service is similar to implementing a GET server. In fact, you
may end up implementing them together in the same code for the sake of
convenience. POST services are handy if you need to send data to the server to be
updated, as for form submissions. To serve a POST request, you need to implement
code in the request handler that reads the contents of the post body out and processes
it.
Once you have processed the data, you dynamically populate the data you want to
send back to the client, write it out to the response, and then call end() to finalize
the response and flush the Writable stream. Just as with a dynamic GET server,
the output of a POST request may be a webpage, HTML snippet, JSON data, or some
other data.
Listing 7.5 shows the basic implementation of a dynamic web service handling
POST requests. In this case, the web service accepts a JSON string from the client
representing an object that has name and occupation properties. The code in lines 4–
6 reads the data from the request stream, and then in the event handler in lines 7–14,
the data is converted to an object and used to build a new object with message and
question properties. Then in line 14 the new object is stringified and sent back
to the client in the end() call.
Listing 7.5 http_server_post.js: Implementing a basic HTTP server that
handles HTTP POST requests
01 var http = require('http');
02 http.createServer(function (req, res) {
03   var jsonData = "";
04   req.on('data', function (chunk) {
05     jsonData += chunk;
06   });
07   req.on('end', function () {
08     var reqObj = JSON.parse(jsonData);
09     var resObj = {
10       message: "Hello " + reqObj.name,
11       question: "Are you a good " + reqObj.occupation + "?"
12     };
13     res.writeHead(200);
14     res.end(JSON.stringify(resObj));
15   });
16 }).listen(8080);
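A POST handler like the one described above will throw if the client sends a body that is not valid JSON. The sketch below shows one way the parsing and reply-building step could be isolated and guarded. The function name buildReply, the 400 status code choice, and the error messages are assumptions for this illustration and are not part of the original example.

```javascript
// Turns the raw POST body into a status code and a JSON reply string.
function buildReply(jsonText) {
  var reqObj;
  try {
    reqObj = JSON.parse(jsonText);
  } catch (err) {
    // JSON.parse() throws a SyntaxError on malformed input.
    return {status: 400, body: JSON.stringify({error: 'Invalid JSON'})};
  }
  if (!reqObj || typeof reqObj.name !== 'string' ||
      typeof reqObj.occupation !== 'string') {
    return {status: 400,
            body: JSON.stringify({error: 'name and occupation are required'})};
  }
  return {status: 200, body: JSON.stringify({
    message: 'Hello ' + reqObj.name,
    question: 'Are you a good ' + reqObj.occupation + '?'
  })};
}

console.log(buildReply('{"name":"Bilbo", "occupation":"Burgler"}'));
console.log(buildReply('{not json'));
```

In the request's end event handler, you would call buildReply() with the accumulated body, pass reply.status to writeHead(), and pass reply.body to end().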
Listing 7.6 shows a basic implementation of an HTTP client that sends JSON data to
the server as part of a POST request. The request is started in line 20. Then in line 21
a JSON string is written to the request stream, and line 22 finishes the request with
end().
Once the server sends the response back, the on('data') handler in lines 10–12
reads the JSON response. Then the on('end') handler in lines 13–18 parses the
response into a JSON object and outputs the raw response, message, and question.
Listing 7.6 Output shows the output of the HTTP POST client.
Listing 7.6 http_client_post.js: Basic HTTP client that sends JSON data
to the server using POST and handles the JSON response
01 var http = require('http');
02 var options = {
03   host: '127.0.0.1',
04   path: '/',
05   port: '8080',
06   method: 'POST'
07 };
08 function readJSONResponse (response) {
09   var responseData = '';
10   response.on('data', function (chunk) {
11     responseData += chunk;
12   });
13   response.on('end', function () {
14     var dataObj = JSON.parse(responseData);
15     console.log("Raw Response: " +responseData);
16     console.log("Message: " + dataObj.message);
17     console.log("Question: " + dataObj.question);
18   });
19 }
20 var req = http.request(options, readJSONResponse);
21 req.write('{"name":"Bilbo", "occupation":"Burgler"}');
22 req.end();
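The client above sends the JSON body without describing it to the server. Many servers expect Content-Type and Content-Length headers on a POST. The sketch below shows one way to build them from an object. The function names buildPostRequest and postJSON are hypothetical, and the host and port simply mirror the values used in the listing.

```javascript
var http = require('http');

// Serializes the payload and builds matching request options.
function buildPostRequest(payloadObj) {
  var body = JSON.stringify(payloadObj);
  return {
    body: body,
    options: {
      host: '127.0.0.1',
      path: '/',
      port: 8080,
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        // Byte length, not character count: multi-byte characters differ.
        'Content-Length': Buffer.byteLength(body)
      }
    }
  };
}

// Sends the payload and calls callback(err, responseText).
function postJSON(payloadObj, callback) {
  var built = buildPostRequest(payloadObj);
  var req = http.request(built.options, function (response) {
    var data = '';
    response.on('data', function (chunk) { data += chunk; });
    response.on('end', function () { callback(null, data); });
  });
  req.on('error', callback);
  req.end(built.body); // write the body and finish the request in one call
}
```

Using Buffer.byteLength() rather than the string length matters as soon as the payload contains non-ASCII characters.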
Listing 7.6 Output Implementing an HTTP POST server serving JSON data
C:\books\node\ch07>node http_server_post.js
Raw Response: {"message":"Hello Bilbo","question":"Are you a good Burgler?
Message: Hello Bilbo
Question: Are you a good Burgler?
Interacting with External Sources
A common use of the HTTP services in Node.js is to access external systems to get
data to fulfill client requests. A variety of external systems provide data that can be
used in various ways. In this example, the code connects to the openweathermap.org
API to retrieve weather information about a city. To keep the example simple, the
output from openweathermap.org is pushed to the browser in a raw format. In reality,
you would likely massage the pieces of data needed into your own pages, widgets, or
data responses.
Listing 7.7 shows the implementation of the web service that accepts both GET and
POST requests. For the GET request, a simple webpage with a form is returned that
allows the user to post a city name. Then in the POST request the city name is
accessed, and the Node.js web client starts up and connects remotely to
openweathermap.org to retrieve weather information for that city. Then that info is
returned to the client along with the original web form.
The big difference between this example and the previous examples is that the
webserver also implements a local web client to connect to the external service and
get data used to formulate the response. The webserver is implemented in lines 35–
49. Notice that if the method is POST, we read the form data from the request stream
and use querystring.parse() to get the city name and call into the
getWeather() function.
The getWeather() function in lines 24–34 implements the client request to
openweathermap.org. Then the parseWeather() request handler in lines 15–23
reads the response from openweathermap.org and passes that data to the
sendResponse() function defined in lines 5–14 that formulates the response and
sends it back to the client. Figure 7.4 shows the implementation of the external
service in a web browser.
Note
You must go to http://openweathermap.org/ to create an account and get an API
key to use the following application.
Listing 7.7 http_server_external: Implementing an HTTP web service
that connects remotely to an external source for weather data
01 var http = require('http');
02 var url = require('url');
03 var qstring = require('querystring');
04 var APIKEY = ""; // place your own API key within the quotes
05 function sendResponse(weatherData, res){
06   var page = '<html><head><title>External Example</title></head>' +
07     '<body>' +
08     '<form method="post"><input name="city"><input type="submit"></form>';
09   if(weatherData){
10     page += '<h1>Weather Info</h1><p>' + weatherData +'</p>';
11   }
12   page += '</body></html>';
13   res.end(page);
14 }
15 function parseWeather(weatherResponse, res) {
16   var weatherData = '';
17   weatherResponse.on('data', function (chunk) {
18     weatherData += chunk;
19   });
20   weatherResponse.on('end', function () {
21     sendResponse(weatherData, res);
22   });
23 }
24 function getWeather(city, res){
25   city = city.replace(' ', '-');
26   console.log(city);
27   var options = {
28     host: 'api.openweathermap.org',
29     path: '/data/2.5/weather?q=' + city + '&APPID=' + APIKEY
30   };
31   http.request(options, function(weatherResponse){
32     parseWeather(weatherResponse, res);
33   }).end();
34 }
35 http.createServer(function (req, res) {
36   console.log(req.method);
37   if (req.method == "POST"){
38     var reqData = '';
39     req.on('data', function (chunk) {
40       reqData += chunk;
41     });
42     req.on('end', function() {
43       var postParams = qstring.parse(reqData);
44       getWeather(postParams.city, res);
45     });
46   } else {
47     sendResponse(null, res);
48   }
49 }).listen(8080);
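The listing builds the request path by replacing the first space in the city name with a hyphen. As an alternative sketch, the querystring module can percent-encode the whole value so that city names with several spaces or special characters remain a valid URL. The function name buildWeatherPath is hypothetical, and whether the remote API prefers encoded spaces over hyphens is an assumption you should check against its documentation.

```javascript
var qstring = require('querystring');

// Builds the request path with properly escaped query parameters.
function buildWeatherPath(city, apiKey) {
  return '/data/2.5/weather?' + qstring.stringify({q: city, APPID: apiKey});
}

console.log(buildWeatherPath('San Francisco', 'abc'));
```

The result can be assigned to the path property of the options object passed to http.request().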
Figure 7.4 Implementing an external web service that connects to a remote source for
weather data
Implementing HTTPS Servers and Clients
Hypertext Transfer Protocol Secure (HTTPS) is a communications protocol that
provides secure communication between HTTP clients and servers. HTTPS is really
just HTTP running on top of the SSL/TLS protocol, which is where it gets its
security capabilities. HTTPS provides security in two main ways. First, it uses
long-term public and secret keys to exchange a short-term session key so that data can be
encrypted between client and server. Second, it provides authentication so that you
can ensure that the webserver you are connecting to is the one you actually think it
is, thus preventing man-in-the-middle attacks where requests are rerouted through a
third party.
The following sections discuss implementing HTTPS servers and clients in your
Node.js environment using the https module. Before getting started using HTTPS,
you need to generate a private key and a public certificate. There are several ways to
do this, depending on your platform. One of the simplest methods is to use the
OpenSSL library for your platform.
To generate the private key, first execute the following OpenSSL command:
openssl genrsa -out server.pem 2048
Next, use the following command to create a certificate signing request file:
openssl req -new -key server.pem -out server.csr
Note
When creating the certificate signing request file, you will be asked several
questions. When prompted for the Common Name, you should put in the domain
name of the server you want to connect to. Otherwise, the certificate will not
work. Also you can put in additional domain names and IP addresses in the
Subject Alternative Names field.
Then to create a self-signed certificate that you can use for your own purpose or for
testing, use the following command:
openssl x509 -req -days 365 -in server.csr -signkey server.pem -out server.crt
Note
The self-signed certificate is fine for testing purposes or internal use. However, if
you are implementing an external web service that needs to be protected on the
Internet, you may want to get a certificate signed by a certificate authority. If you
want to create a certificate that is signed by a third-party certificate authority, you
need to take additional steps.
Creating an HTTPS Client
Creating an HTTPS client is almost exactly like the process of creating an HTTP
client discussed earlier in this chapter. The only difference is that there are additional
options, shown in Table 7.9, that allow you to specify the security options for the
client. The most important options you really need to worry about are key, cert,
and agent.
The key option specifies the private key used for SSL. The cert value specifies
the X.509 public certificate to use. The global agent does not support options needed by
HTTPS, so you need to disable the agent by setting the agent to false, as shown
here:
var options = {
  key: fs.readFileSync('test/keys/client.pem'),
  cert: fs.readFileSync('test/keys/client.crt'),
  agent: false
};
You can also create your own custom Agent object that specifies the agent options
used for the request:
options.agent = new https.Agent (options);
Once you have defined the options with the cert, key, and agent settings, you
can call the https.request(options, [responseCallback]), and it
will work exactly the same as the http.request() call. The only difference is
that the data between the client and server is encrypted.
var fs = require('fs');
var https = require('https');
var options = {
  hostname: 'encrypted.mysite.com',
  port: 443,
  path: '/',
  method: 'GET',
  key: fs.readFileSync('test/keys/client.pem'),
  cert: fs.readFileSync('test/keys/client.crt'),
  agent: false
};
var req = https.request(options, function(res) {
  // Handle the response the same way as for http.request()
});
req.end();
Table 7.9 Additional options for https.request() and
https.createServer()
Option              Description
pfx                 A string or Buffer object containing the private key,
                    certificate, and CA certs of the server in PFX or
                    PKCS12 format.
key                 A string or Buffer object containing the private key
                    to use for SSL.
passphrase          A string containing the passphrase for the private key
                    or pfx.
cert                A string or Buffer object containing the public x509
                    certificate to use.
ca                  An Array of strings or Buffers of trusted
                    certificates in PEM format to check the remote host
                    against.
ciphers             A string describing the ciphers to use or exclude.
rejectUnauthorized  A Boolean that, when true, causes the server certificate
                    to be verified against the list of supplied CAs. An error
                    event is emitted if verification fails. Verification
                    happens at the connection level, before the HTTP
                    request is sent. Defaults to true. Only for
                    https.request() options.
crl                 Either a string or list of strings of PEM-encoded CRLs
                    (Certificate Revocation Lists). Only for
                    https.createServer().
secureProtocol      The SSL method to use. For example,
                    SSLv3_method to force SSL version 3.
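As an illustration of the ca and rejectUnauthorized options in Table 7.9, the sketch below builds client options that trust one specific certificate, such as a self-signed certificate used for testing. The function names buildSecureOptions and secureGet, and the idea of passing the PEM text in as a parameter, are assumptions made for this example.

```javascript
var https = require('https');

// Builds https.request() options that verify the server against a given
// PEM certificate (for example, the contents of a self-signed server.crt).
function buildSecureOptions(hostname, caPem) {
  return {
    hostname: hostname,
    port: 443,
    path: '/',
    method: 'GET',
    ca: [caPem],              // certificates to trust for this request
    rejectUnauthorized: true  // fail the connection if verification fails
  };
}

// Sends the request and calls callback(err, statusCode).
function secureGet(hostname, caPem, callback) {
  var req = https.request(buildSecureOptions(hostname, caPem), function (res) {
    res.resume(); // discard the body; only the status is reported here
    callback(null, res.statusCode);
  });
  req.on('error', callback); // verification failures arrive here
  req.end();
}
```

With a self-signed certificate, supplying it through ca lets verification succeed without turning rejectUnauthorized off.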
Creating an HTTPS Server
Creating an HTTPS server is almost exactly like the process of creating an HTTP
server discussed earlier in this chapter. The only difference is that there are
additional options parameters that you must pass into
https.createServer(). The options, listed previously in Table 7.9, allow you
to specify the security options for the server. The most important options you really
need to worry about are key and cert.
The key option specifies the private key used for SSL. The cert value specifies
the X.509 public certificate to use. The following shows an example of creating an HTTPS
server in Node.js:
var fs = require('fs');
var https = require('https');
var options = {
  key: fs.readFileSync('test/keys/server.pem'),
  cert: fs.readFileSync('test/keys/server.crt')
};
https.createServer(options, function (req, res) {
  res.writeHead(200);
  res.end("Hello Secure World\n");
}).listen(8080);
Once the HTTPS server has been created, the request/response handling works the
same way as for the HTTP servers described earlier in this chapter.
Summary
An important aspect of Node.js is the ability to quickly implement HTTP and
HTTPS servers and services. The http and https modules provide everything
you need to implement webserver basics. For your full webserver, you should use a
more extended library, such as Express. However, the http and https modules
work well for some basic web services and are simple to implement.
The examples in this chapter covered the HTTP basics to give you a good start on
implementing your own services. You also saw how the url and querystring
modules are used to parse URLs and query strings into objects and back.
Next
In the next chapter, you go a little deeper as the net module is discussed. You learn
how to implement your own socket services using TCP clients and servers.
8
Implementing Socket Services in Node.js
An important part of backend services is the ability to communicate with each other
over sockets. Sockets allow one process to communicate with another process
through an IP address and port. This can be useful when implementing interprocess
communication (IPC) for two different processes running on the same server or
accessing a service running on a completely different server. Node.js provides the
net module that allows you to create both socket servers and clients that can
connect to socket servers. For secure connections, Node.js provides the tls module
that allows you to implement secure TLS socket servers and clients.
Understanding Network Sockets
Network sockets are endpoints of communication that flow across a computer
network. Sockets live below the HTTP layer and provide the actual point-to-point
communication between servers. Virtually all Internet communication is based on
Internet sockets that flow data between two points on the Internet.
A socket works using a socket address, which is a combination of an IP address and
port. There are two types of points in a socket connection: a server that listens for
connections and a client that opens a connection to the server. Both the server and
the client require a unique IP address and port combination.
The Node.js net module sockets communicate by sending raw data using the
Transmission Control Protocol (TCP). This protocol is responsible for packaging the
data and guaranteeing that it is sent from point to point successfully. Node.js sockets
implement the Duplex stream, which allows you to read and write streamed data
between the server and client.
Sockets are the underlying structure for the http module. If you do not need the
functionality for handling web requests like GET and POST and you just need to
stream data from point to point, then using sockets gives you a lighter weight
solution and a bit more control.
Sockets are also handy when communicating with other processes running on the
same computer. Processes cannot share memory directly, so if you want to access the
data in one process from another process, you can open up the same socket in each
process and read and write data between the two processes.
Understanding TCP Server and Socket Objects
To use the net module in Node.js applications, you first need to understand the TCP
Server and Socket objects. These objects provide all the framework for starting
a TCP server to handle requests and implementing TCP socket clients to make
requests to the socket servers. Once you understand the events, properties, methods,
and behavior of these objects, it will be simple to implement your own TCP socket
servers and clients.
The following sections cover the purpose and behavior of the net.Socket and
net.Server objects. The most important events, properties, and methods that
each provides are also covered.
The net.Socket Object
Socket objects are created on both the socket server and the socket client and allow
data to be written and read back and forth between them. The Socket object
implements a Duplex stream, so it provides all the functionality that Writable
and Readable streams provide. For example, you can use the write() method to
stream writes of data to the server or client and a data event handler to stream data
from the server or client.
On the socket client, the Socket object is created internally when you call
net.connect() or net.createConnection(). This object is intended to
represent the socket connection to the server. You use the Socket object to monitor
the connection, send data to the server, and handle the response back from the server.
There is no explicit client object in the Node.js net module because the Socket
object acts as the full client allowing you to send/receive data and terminate the
connection.
On the socket server, the Socket object is created when a client connects to the
server and is passed to the connection event handler. This object is intended to
represent the socket connection to the client. On the server, you use the Socket
object to monitor the client connection as well as send and receive data to and from
the client.
To create a Socket object, you use one of the following methods. All the calls
return a Socket object. The only difference is the first parameters that they accept.
The final parameter for all of them is a callback function that is executed when a
connection is opened to the server. Notice that for each method there is a
net.connect() and a net.createConnection() form. These work
exactly the same way:
net.connect(options, [connectListener])
net.createConnection(options, [connectListener])
net.connect(port, [host], [connectListener])
net.createConnection(port, [host], [connectListener])
net.connect(path, [connectListener])
net.createConnection(path, [connectListener])
The first method to create a Socket object is to pass an options parameter,
which is an object that contains properties that define the socket connection. Table
8.1 lists the properties that can be specified when creating the Socket object. The
second method accepts port and host values, described in Table 8.1, as direct
parameters. The third option accepts a path parameter that specifies a file system
location that is a Unix socket to use when creating the Socket object.
Table 8.1 Options that can be specified when creating a Socket
Property      Description
port          Port number the client should connect to. This option is
              required.
host          Domain name or IP address of the server that the client should
              connect to. Defaults to localhost.
localAddress  Local IP address the client should bind to for network
              connections.
localPort     The local port that it binds to for network connections.
family        Version of IP stack. (default: 4)
lookup        Custom lookup. (default: dns.lookup)
Once the Socket object is created, it provides several events that are emitted during
the life cycle of the connection to the server. For example, the connect event is
triggered when the socket connects, the data event is emitted when there is data in
the Readable stream ready to be read, and the close event is emitted when
connection to the server is closed. As you implement your socket server, you can
register callbacks to be executed when these events are emitted to handle opening
and closing the socket, reading and writing data, and so on. Table 8.2 lists the events
that can be triggered on the Socket object.
Table 8.2 Events that can be triggered on Socket objects
Event    Description
connect  Emitted when a connection is successfully established with the server.
         The callback function does not accept any parameters.
data     Emitted when data is received on the socket. If no data event handler is
         attached, then data can be lost. The callback function must accept a
         parameter, which is a Buffer object containing the chunk of data
         that was read from the socket. For example:
         function(chunk){}
end      Emitted when the server terminates the connection by sending a FIN.
         The callback function does not accept any parameters.
timeout  Emitted when the connection to the server times out due to inactivity.
drain    Emitted when the write buffer becomes empty. You can use this event
         to throttle back the data stream being written to the socket. The
         callback function does not accept any parameters.
error    Emitted when an error occurs on the socket connection. The callback
         function should accept the error as the only argument. For example:
         function(error){}
close    Emitted when the socket has fully closed, either because it was closed
         by an end() or because an error occurred. The callback function does
         not accept any parameters.
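The drain event in Table 8.2 is typically paired with the return value of write(), which is false when the internal buffer has filled past its threshold. The sketch below shows one way to throttle writes with it. The function name writeChunks is hypothetical, and the function works with any Writable stream, including a Socket.

```javascript
// Writes each chunk in order, pausing whenever write() reports that the
// buffer is full and resuming when the drain event is emitted.
function writeChunks(stream, chunks, done) {
  var i = 0;
  function next() {
    while (i < chunks.length) {
      var ok = stream.write(chunks[i++]);
      if (!ok) {
        // Buffer is full: wait for drain before writing any more.
        stream.once('drain', next);
        return;
      }
    }
    done();
  }
  next();
}
```

On a socket client you might call writeChunks(client, chunks, function () { client.end(); }) inside the connect callback.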
The Socket object also includes several methods that allow you to do things like
read from and write to the socket as well as pause or end data flow. Many of these
are inherited from the Duplex stream objects and should be familiar to you. Table
8.3 lists the methods available on the Socket object.
Table 8.3 Methods that can be called on Socket Objects
Method                   Description
setEncoding([encoding])  When this function is called, data returned from the socket
                         streams is an encoded String instead of a Buffer object.
                         Sets the default encoding that should be used when writing
                         data to and reading data from the streams. Using this option
                         handles multi-byte characters that might otherwise be
                         mangled when converting the Buffer to a string using
                         buf.toString(encoding). If you want to read the data
                         as strings, always use this method.
write(data, [encoding],  Writes a data Buffer or String to the Writable stream
[callback])              of the socket using the encoding if specified. The callback
                         function is executed as soon as the data is written.
end([data], [encoding])  Writes a data Buffer or String to the Writable stream
                         of the socket and then flushes the stream and closes the
                         connection.
destroy()                This forces the socket connection to shut down. You should
                         only need to use this in the case of failures.
pause()                  Pauses a Readable stream of the socket from emitting
                         data events. This allows you to throttle back the upload of
                         data to the stream.
resume()                 Resumes data event emitting on the Readable stream of
                         the socket.
setTimeout(timeout,      Specifies a timeout in milliseconds that the server will wait
[callback])              before emitting a timeout event when the socket is
                         inactive. The callback function will be triggered as a
                         once event listener. If you want the connection to be
                         terminated on timeout, you should do it manually in the
                         callback function.
setNoDelay([noDelay])    Disables/enables the Nagle algorithm that buffers data before
                         sending it. Setting this to true, or passing no argument,
                         disables the buffering. Setting this to false enables it.
setKeepAlive([enable],   Enables/disables the keep-alive functionality on the
[initialDelay])          connection. The optional initialDelay parameter
                         specifies the amount in milliseconds that the socket is idle
                         before sending the first keep-alive packet.
address()                Returns the bound address, the address family name, and
                         port of the socket as reported by the operating system. The
                         return value is an object that contains the port, family, and
                         address properties. For example:
                         { port: 8107, family: 'IPv4', address: '127.0.0.1' }
unref()                  Calling this method allows the Node.js application to
                         terminate if this socket is the only event on the event queue.
ref()                    References this socket so that if this socket is the only thing
                         on the event queue, the Node.js application will not terminate.
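Several of the methods in Table 8.3 can be seen working together in the following sketch, which sends one message to a throwaway echo server and collects the reply. This is only an illustration: the function name echoOnce, the two-second timeout, and the use of port 0 (any free port) are assumptions made here, and the echo server exists only so that the sketch can run on its own.

```javascript
var net = require('net');

// Sends message to a temporary echo server and calls callback(err, reply).
function echoOnce(message, callback) {
  // Echo server: pipe everything received on a socket straight back to it.
  var server = net.createServer(function (socket) {
    socket.pipe(socket);
  });
  server.listen(0, '127.0.0.1', function () {
    var received = '';
    var client = net.connect(
      {port: server.address().port, host: '127.0.0.1'},
      function () {
        client.write(message);
        client.end(); // flush pending data, then send FIN
      });
    client.setEncoding('utf8'); // data events now deliver strings
    client.setTimeout(2000, function () {
      client.destroy(); // timeouts do not close the socket automatically
    });
    client.on('data', function (chunk) {
      received += chunk;
    });
    client.on('error', function (err) {
      server.close();
      callback(err);
    });
    client.on('close', function (hadError) {
      if (!hadError) {
        server.close();
        callback(null, received);
      }
    });
  });
}

echoOnce('Some Data\r\n', function (err, reply) {
  console.log(err ? 'Error: ' + err.message : 'Echoed: ' + reply);
});
```

Note that the timeout callback calls destroy() itself, because a timeout only emits an event and leaves the connection open.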
The Socket object also provides several properties that you can access to get
information about the object, such as the address and port the socket is
communicating on, the amount of data being written, and the buffer size. Table 8.4
lists the properties available on the Socket object.
Table 8.4 Properties that can be accessed on Socket objects
Property
Description
bufferSize
Returns the number of bytes currently buffered waiting to be
written to the socket’s stream
remoteAddress
IP address of the remote server that the socket is connected to
remotePort
Port of the remote server that the socket is connected to
remoteFamily
IP family of the remote server that the socket is connected to
localAddress
Local IP address the remote client is using for the socket
connection
localPort
Local port the remote client is using for the socket connection
bytesRead
Number of bytes read by the socket
bytesWritten
Number of bytes written by the socket
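The properties in Table 8.4 can be read at any point after the connection is established. The following is a minimal sketch, assuming a loopback echo server on an operating-system-assigned port; the describeSocket() helper is a hypothetical name introduced only for this illustration:

```javascript
var net = require('net');

// Hypothetical helper: summarizes some of the Table 8.4 properties of a socket.
function describeSocket(socket) {
  return 'local ' + socket.localAddress + ':' + socket.localPort +
    ', remote ' + socket.remoteAddress + ':' + socket.remotePort +
    ', read ' + socket.bytesRead + ' bytes, wrote ' + socket.bytesWritten + ' bytes';
}

// Echo server used only so that the client has something to talk to.
var server = net.createServer(function(conn) {
  conn.on('data', function(data) { conn.write(data); });
  conn.on('error', function(err) { console.log('Server socket error: ' + err.message); });
});
server.on('error', function(err) { console.log('Server error: ' + err.message); });

// Port 0 asks the operating system for any free port.
server.listen(0, '127.0.0.1', function() {
  var client = net.connect({port: server.address().port, host: '127.0.0.1'});
  client.on('connect', function() { client.write('ping'); });
  client.on('data', function() {
    console.log(describeSocket(client));
    client.end();     // close the client side of the connection
    server.close();   // stop listening so that the process can exit
  });
  client.on('error', function(err) { console.log('Client error: ' + err.message); });
});
```

The local port printed for the client differs from run to run because the operating system picks it when the connection is made.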
To illustrate flowing data across a Socket object, the following code shows the
basics of implementing the Socket object on a client. Notice that the
net.connect() method is called using an options object containing a port
and a host attribute. The connect callback function logs a message and then
writes some data out to the server. To handle data coming back from the server, the
on('data') event handler is implemented. To handle the closure of the socket, the
on('end') event handler is implemented:
var net = require('net');
var client = net.connect({port: 8107, host:'localhost'}, function() {
  console.log('Client connected');
  client.write('Some Data\r\n');
});
client.on('data', function(data) {
  console.log(data.toString());
  client.end();
});
client.on('end', function() {
  console.log('Client disconnected');
});
The net.Server Object
The net.Server object is used to create a TCP socket server and begin listening
for connections to which you will be able to read and write data. The Server object
is created internally when you call net.createServer(). This object represents
the socket server and handles listening for connections and then sending and
receiving data on those connections to the server.
When the server receives a connection, the Server creates a Socket object and
passes it to any connection event handlers that are listening. Because the Socket
object implements a Duplex stream, you can use the write() method to stream
writes of data back to the client and a data event handler to stream data from the
client.
To create a Server object, you use the net.createServer() method shown
here:
net.createServer([options], [connectionListener])
The options parameter is an object that specifies options to use when creating the
socket Server object. Table 8.5 lists the properties of the options object. The
second parameter is the connection event callback function, which is executed
when a connection is received. This connectionListener callback function is
passed the Socket object for the connecting client.
Table 8.5 Options that can be specified when creating a net.Server
Property
Description
allowHalfOpen
A Boolean; when true, the socket won’t automatically send
a FIN packet when the other end of the socket sends a FIN
packet, thus allowing half of the Duplex stream to remain
open. Defaults to false.
pauseOnConnect
A Boolean; when true, each socket for each connection is
paused, and no data will be read from its handle. This allows
processes to pass connections between them without reading
any data. Defaults to false.
Once the Server object is created, it provides several events that are triggered
during the life cycle of the server. For example, the connection event is triggered
when a socket client connects, and the close event is triggered when the server
shuts down. As you implement your socket server, you can register callbacks to be
executed when these events are triggered to handle connections, errors, and
shutdown. Table 8.6 lists the events that can be triggered on the Server object.
Table 8.6 Events that can be triggered on Server objects
Event
Description
listening
Emitted when the server begins listening on a port by calling the
listen() method. The callback function does not accept any
parameters.
connection
Emitted when a connection is received from a socket client. The
callback function must accept a parameter that is a Socket object
representing the connection to the connecting client. For example:
function(client){}
close
Emitted when the server closes either normally or on error. This
event is not emitted until all client connections have ended.
error
Emitted when an error occurs. The close event also is triggered
on errors.
The Server object also includes several methods that allow you to do things like
begin listening for connections, count the current connections, and stop the server.
Table 8.7 lists the methods available on the Server object.
Table 8.7 Methods that can be called on Server objects
Method
Description
listen(port, [host], [backlog], [callback])
Opens up a port on the server and begins listening for
connections. port specifies the listening port. If you specify
0 for the port, a random port number is selected. host is the
IP address to listen on. If it is omitted, the server accepts
connections directed to any IPv4 address. backlog specifies
the maximum number of pending connections the server will
allow. The default is 511.
The callback function is called when the server has opened
the port and begins listening.
listen(path, [callback])
Same as the preceding method except that a Unix socket
server is started to listen for connections on the file
path specified.
listen(handle, [callback])
Same as the preceding method except that the server listens on
an existing handle. The handle is a Server or Socket object whose
underlying handle member points to a file descriptor on the server.
This assumes that the file descriptor points to a socket file that has
been bound to a port already.
getConnections(callback)
Returns the number of connections currently connected to the
server. The callback is executed when the number of
connections is calculated and accepts an error parameter
and a count parameter. For example:
function(error, count)
close([callback])
Stops the server from accepting new connections. Current
connections are allowed to remain until they complete. The
server does not truly stop until all current connections have
been closed.
address()
Returns the bound address, the address family name, and the
port of the socket as reported by the operating system. The
return value is an object that contains the port, family, and
address properties. For example:
{ port: 8107, family: 'IPv4', address: '127.0.0.1' }
unref()
Calling this method allows the Node.js application to
terminate if this server is the only event on the event queue.
ref()
References this server so that if this server is the only thing
on the event queue, the Node.js application will not terminate.
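To see several of the Server methods working together, the following minimal sketch listens on an operating-system-assigned port, reports address(), counts connections with getConnections(), and then shuts down with close(). The formatAddress() helper and the probe client are hypothetical names used only for this illustration:

```javascript
var net = require('net');

// Hypothetical helper: formats the object returned by address().
function formatAddress(addr) {
  return addr.address + ':' + addr.port + ' (' + addr.family + ')';
}

var server = net.createServer(function(client) {
  client.resume();   // consume incoming data so that the end of the stream is seen
  client.on('error', function(err) { console.log('Socket error: ' + err.message); });
  server.getConnections(function(err, count) {
    console.log('Connections: ' + count);
    client.end('bye\r\n');   // finish this connection
    server.close();          // stop accepting new connections
  });
});
server.on('close', function() {
  console.log('Server closed');   // emitted only after all connections have ended
});
server.on('error', function(err) {
  console.log('Server error: ' + err.message);
});

// Port 0 asks the operating system for any free port.
server.listen(0, '127.0.0.1', function() {
  console.log('Listening on ' + formatAddress(server.address()));
  var probe = net.connect({port: server.address().port, host: '127.0.0.1'});
  probe.on('data', function(data) {
    console.log('Client received: ' + data.toString().trim());
  });
  probe.on('error', function(err) { console.log('Client error: ' + err.message); });
});
```

Calling close() while the client is still connected is safe here because the server keeps the existing connection open until it ends.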
The Server object also provides the maxConnections attribute, which allows
you to set the maximum number of connections that the server will accept before
rejecting them. If a process has been forked to a child for processing using
child_process.fork(), you should not use this option.
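As a brief illustration, maxConnections is set as an ordinary property on the Server object. The limit of 2 below is an arbitrary value chosen for this sketch:

```javascript
var net = require('net');

var server = net.createServer(function(client) {
  client.end('Hello\r\n');
});

// Limit the server to two simultaneous connections; connections beyond
// this limit are rejected by the server.
server.maxConnections = 2;

console.log('Max connections: ' + server.maxConnections);
```

The server in this sketch never calls listen(), so the script exits immediately after printing the limit.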
The following code shows the basics of implementing the Server object. Notice
that the net.createServer() method is called and implements a callback that
accepts the client Socket object. To handle data coming back from the client, the
on('data') event handler is implemented. To handle the closure of the socket, the
on('end') event handler is implemented. To begin listening for connections, the
listen() method is called on port 8107:
var net = require('net');
var server = net.createServer(function(client) {
  console.log('Client connected');
  client.on('data', function(data) {
    console.log('Client sent ' + data.toString());
  });
  client.on('end', function() {
    console.log('Client disconnected');
  });
  client.write('Hello');
});
server.listen(8107, function() {
  console.log('Server listening for connections');
});
Implementing TCP Socket Servers and Clients
Now that you understand the net.Server and net.Socket objects, you are
ready to implement some Node.js TCP clients and servers. This section guides you
through the process of implementing basic TCP clients and servers in Node.js.
The examples in the following sections are basic to make it easy for you to grasp the
concepts of starting the TCP server listening on a port, and then implementing clients
that can connect. The examples are designed to help you see the interactions and
event handling that need to be implemented.
Implementing a TCP Socket Client
At the most basic level, implementing a TCP socket client involves the process of
creating a Socket object that connects to the server, writing data to the server, and
then handling the data that comes back. Additionally, you should build the socket so
that it can also handle errors, the buffer being full, and timeouts. This section
discusses each of the steps to implement a socket client using the Socket object.
Listing 8.1 presents the full code for the following discussion.
The first step is to create the socket client by calling net.connect() as shown
below. Pass in the port and host that you want to connect to, and
implement a callback function to handle the connect event:
net.connect({port: 8107, host:'localhost'}, function() {
//handle connection
});
Then inside the callback you should set up the connection behavior. For example,
you may want to add a timeout or set the encoding as shown here:
this.setTimeout(500);
this.setEncoding('utf8');
You also need to add handlers for the data, end, error, timeout, and close
events that you want to handle. For example, to handle the data event so that you
can read data coming back from the server, you might add the following handler
once the connection has been established:
this.on('data', function(data) {
console.log("Read from server: " + data.toString());
//process the data
this.end();
});
To write data to the server, you call the write() method. The write() method returns
false when the data could not be flushed to the kernel buffer and had to be queued in
memory. The queued data is still sent, so it should not be written again. If you are
writing a lot of data to the server, you may also want to implement a drain event
handler that signals when the buffer is empty and it is safe to write more. The
following shows an example of implementing a drain handler when write() returns
false:
function writeData(socket, data){
  // write() returns false when the data had to be queued in memory
  var flushed = socket.write(data);
  if (!flushed){
    // the queued data is still sent; wait for drain before writing more
    socket.once('drain', function(){
      console.log('Write buffer drained');
    });
  }
  return flushed;
}
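When several chunks must be written in order, the drain event can pace the writes. The following is a minimal sketch of that idea and is not part of Listing 8.1. The writeAll() function is a hypothetical helper, and FakeSocket is a stand-in (an assumption for this sketch) that reports a full buffer on every second write so that the behavior can be seen without a network connection:

```javascript
var EventEmitter = require('events').EventEmitter;

// Writes the chunks in order, pausing whenever write() returns false and
// resuming on the next drain event. done() is called after the last chunk.
function writeAll(socket, chunks, done) {
  var index = 0;
  function writeNext() {
    while (index < chunks.length) {
      var flushed = socket.write(chunks[index]);
      index++;
      if (!flushed && index < chunks.length) {
        // The chunk was queued, not lost; wait before writing more.
        socket.once('drain', writeNext);
        return;
      }
    }
    if (done) { done(); }
  }
  writeNext();
}

// Stand-in for a net.Socket whose write() returns false on every second call.
function FakeSocket() {
  EventEmitter.call(this);
  this.written = [];
}
FakeSocket.prototype = Object.create(EventEmitter.prototype);
FakeSocket.prototype.write = function(chunk) {
  this.written.push(chunk);
  return this.written.length % 2 !== 0;
};

var fake = new FakeSocket();
var finished = false;
writeAll(fake, ['a', 'b', 'c'], function() { finished = true; });
console.log(fake.written.length + ' chunks written before drain');
fake.emit('drain');   // simulate the buffer emptying
console.log(fake.written.length + ' chunks written after drain');
```

With a real Socket object the drain event is emitted by Node.js itself, so only writeAll() would be needed.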
Listing 8.1 shows the full implementation of a basic TCP socket client. All the client
does is send a bit of data to the server and receive a bit of data back; however, the
example could easily be expanded to support more complex data handling across the
socket. Notice that three separate sockets are opened to the server and are
communicating at the same time. Notice that each client created gets a different
random port number, as shown in Listing 8.1 Output.
Listing 8.1 socket_client.js: Implementing basic TCP socket clients
var net = require('net');
function getConnection(connName){
  var client = net.connect({port: 8107, host:'localhost'}, function() {
    console.log(connName + ' Connected: ');
    console.log('   local = %s:%s', this.localAddress, this.localPort);
    console.log('   remote = %s:%s', this.remoteAddress, this.remotePort);
    this.setTimeout(500);
    this.setEncoding('utf8');
    this.on('data', function(data) {
      console.log(connName + " From Server: " + data.toString());
      this.end();
    });
    this.on('end', function() {
      console.log(connName + ' Client disconnected');
    });
    this.on('error', function(err) {
      console.log('Socket Error: ', JSON.stringify(err));
    });
    this.on('timeout', function() {
      console.log('Socket Timed Out');
    });
    this.on('close', function() {
      console.log('Socket Closed');
    });
  });
  return client;
}
function writeData(socket, data){
  // write() returns false when the data had to be queued in memory
  var flushed = socket.write(data);
  if (!flushed){
    // the queued data is still sent; wait for drain before writing more
    socket.once('drain', function(){
      console.log('Write buffer drained');
    });
  }
  return flushed;
}
var Dwarves = getConnection("Dwarves");
var Elves = getConnection("Elves");
var Hobbits = getConnection("Hobbits");
writeData(Dwarves, "More Axes");
writeData(Elves, "More Arrows");
writeData(Hobbits, "More Pipe Weed");
Listing 8.1 Output socket_client.js: Implementing basic TCP socket
clients
Elves Connected:
local = 127.0.0.1:62616
remote = 127.0.0.1:8107
Dwarves Connected:
local = 127.0.0.1:62617
remote = 127.0.0.1:8107
Hobbits Connected:
local = 127.0.0.1:62618
remote = 127.0.0.1:8107
Elves From Server: Sending: More Arrows
Dwarves From Server: Sending: More Axes
Hobbits From Server: Sending: More Pipe Weed
Dwarves Client disconnected
Socket Closed
Elves Client disconnected
Socket Closed
Hobbits Client disconnected
Socket Closed
Implementing a TCP Socket Server
At the most basic level, implementing a TCP socket server involves the process of
creating a Server object, listening on a port, and then handling incoming
connections, including reading and writing data to and from the connections.
Additionally, the socket server should handle the close and error events on the
Server object as well as the events that occur in the incoming client connection
Socket object. This section discusses each of the steps to implement a socket
server using the Server object. Listing 8.2 presents the full code for the following
discussion.
The first step is to create the socket server by calling net.createServer() as
shown below. You also need to provide a connection callback handler and then call
listen() to begin listening on the port:
var server = net.createServer(function(client) {
//implement the connection callback handler code here.
});
server.listen(8107, function() {
//implement the listen callback handler here.
});
Inside the listen callback handler, you should also add handlers to support the
close and error events on the Server object. These may just be log statements,
or you may also want to add additional code that is executed when these events
occur. The following shows basic examples:
server.on('close', function(){
console.log('Server Terminated');
});
server.on('error', function(err){
});
Inside the connection event callback, you need to set up the connection behavior
on the client Socket object that is passed to the callback. For example, you might
want to add a timeout or set the encoding as shown here:
client.setTimeout(500);
client.setEncoding('utf8');
You also need to add handlers for the data, end, error, timeout, and close
events that you want to handle on the client connection. For example, to handle the
data event so that you can read data coming from the client, you might add the
following handler once the connection is established:
client.on('data', function(data) {
  console.log("Received from client: " + data.toString());
  //process the data
});
To write data to the client, you call the write() method somewhere in
your code. The write() method returns false when the data could not be flushed to
the kernel buffer and had to be queued in memory. The queued data is still sent, so
it should not be written again. If you are writing a lot of data to the client, then
you may also want to implement a drain event handler that signals when the buffer is
empty and it is safe to write more. This can help when the buffer is full, or if
you want to throttle back writing to the socket. The following shows an example of
implementing a drain handler when write() returns false:
function writeData(socket, data){
  // write() returns false when the data had to be queued in memory
  var flushed = socket.write(data);
  if (!flushed){
    // the queued data is still sent; wait for drain before writing more
    socket.once('drain', function(){
      console.log('Write buffer drained');
    });
  }
  return flushed;
}
The code in Listing 8.2 shows the full implementation of a basic TCP socket server.
The socket server accepts connections on port 8107, reads the data in, and then
writes a string back to the client. Although the implementation is basic, it illustrates
handling the events as well as reading and writing data in the client connection.
Listing 8.2 socket_server.js: Implementing a basic TCP socket server
var net = require('net');
var server = net.createServer(function(client) {
  console.log('Client connection: ');
  console.log('   local = %s:%s', client.localAddress, client.localPort);
  console.log('   remote = %s:%s', client.remoteAddress, client.remotePort);
  client.setTimeout(500);
  client.setEncoding('utf8');
  client.on('data', function(data) {
    console.log('Received data from client on port %d: %s',
                client.remotePort, data.toString());
    console.log('  Bytes received: ' + client.bytesRead);
    writeData(client, 'Sending: ' + data.toString());
    console.log('  Bytes sent: ' + client.bytesWritten);
  });
  client.on('end', function() {
    console.log('Client disconnected');
    server.getConnections(function(err, count){
      console.log('Remaining Connections: ' + count);
    });
  });
  client.on('error', function(err) {
    console.log('Socket Error: ', JSON.stringify(err));
  });
  client.on('timeout', function() {
    console.log('Socket Timed Out');
  });
});
server.listen(8107, function() {
  console.log('Server listening: ' + JSON.stringify(server.address()));
  server.on('close', function(){
    console.log('Server Terminated');
  });
  server.on('error', function(err){
    console.log('Server Error: ', JSON.stringify(err));
  });
});
function writeData(socket, data){
  // write() returns false when the data had to be queued in memory
  var flushed = socket.write(data);
  if (!flushed){
    // the queued data is still sent; wait for drain before writing more
    socket.once('drain', function(){
      console.log('Write buffer drained');
    });
  }
  return flushed;
}
Implementing TLS Servers and Clients
Transport Layer Security/Secure Socket Layer (TLS/SSL) is a cryptographic
protocol designed to provide secure communications on the Internet. It uses X.509
certificates along with session keys to verify whether the socket server you are
communicating with is the one that you want to communicate with. TLS provides
security in two main ways. First, it uses long-term public and secret keys to
exchange a short-term session key so that data can be encrypted between client and
server. Second, it provides authentication so that you can ensure that the webserver
you are connecting to is the one you actually think it is, thus preventing
man-in-the-middle attacks where requests are rerouted through a third party.
The following sections discuss implementing TLS socket servers and clients in your
Node.js environment using the tls module. Before getting started using TLS, you
need to generate a private key and public certificate for both your clients and your
server. There are several ways to do this depending on your platform. One of the
simplest methods is to use the OpenSSL library for your platform.
To generate the private key, first execute the following OpenSSL command:
openssl genrsa -out server.pem 2048
Next, use the following command to create a certificate signing request file:
openssl req -new -key server.pem -out server.csr
Note
When creating the certificate signing request file, you are asked several questions.
When prompted for the Common Name, you should put in the domain name of
the server you want to connect to. Otherwise, the certificate will not work. Also
you can put in additional domain names and IP addresses in the Subject
Alternative Names field.
Then to create a self-signed certificate that you can use for your own purpose or
testing, use the following command:
openssl x509 -req -days 365 -in server.csr -signkey server.pem -out server.crt
Note
The self-signed certificate is fine for testing purposes or internal use. However, if
you are implementing an external web service that needs to be protected on the
Internet, you may want to get a certificate signed by a certificate authority. If you
want to create a certificate that is signed by a third-party certificate authority, you
need to take additional steps.
Creating a TLS Socket Client
Creating a TLS client is almost exactly like the process of creating a socket client
discussed earlier in this chapter. The only difference is that there are additional
options, shown in Table 8.8, that allow you to specify the security options for the
client. The most important options you need to worry about are key, cert, and ca.
The key option specifies the private key used for SSL. The cert value specifies
the x509 public key to use. If you are using a self-signed certificate, you need to
point the ca property at the certificate for the server:
var options = {
key: fs.readFileSync('test/keys/client.pem'),
cert: fs.readFileSync('test/keys/client.crt'),
ca: fs.readFileSync('test/keys/server.crt')
};
Once you have defined the options with the cert, key, and ca settings, you
can call tls.connect(options, [callback]), and it will
work exactly the same as the net.connect() call. The only difference is that the
data between the client and server is encrypted. The host and port to connect to are
specified in the same options object:
var options = {
  host: 'encrypted.mysite.com',
  port: 8108,
  key: fs.readFileSync('test/keys/client.pem'),
  cert: fs.readFileSync('test/keys/client.crt'),
  ca: fs.readFileSync('test/keys/server.crt')
};
var client = tls.connect(options, function() {
  // the secure connection is established; read and write on client here
});
Table 8.8 Additional options for tls.connect()
Option
Description
pfx
A string or Buffer object containing the private key,
certificate, and CA certs of the server in PFX or
PKCS12 format.
key
A string or Buffer object containing the private key
to use for SSL.
passphrase
A string containing the passphrase for the private key
or pfx.
cert
A string or Buffer object containing the public x509
certificate to use.
ca
An array of strings or buffers of trusted certificates in
PEM format to check the remote host against.
rejectUnauthorized
A Boolean; when true, the server certificate is verified
against the list of supplied CAs. An error event is
emitted if verification fails. Verification happens at the
connection level, before the HTTP request is sent.
Defaults to true.
servername
Specifies the server name for the Server Name
Indication (SNI) TLS extension.
secureProtocol
Specifies the SSL method to use. For example,
SSLv3_method will force SSL version 3.
Creating a TLS Socket Server
Creating a TLS socket server is almost exactly like the process of creating a socket
server discussed earlier in this chapter. The only differences are that there are
additional options parameters that you must pass into
tls.createServer(), and there are some additional events that can be
triggered on the tls.Server object. The options, listed in Table 8.9, allow you to
specify the security options for the server. Table 8.10 lists the additional events for
the TLS socket server. The most important options you need to worry about are key,
cert, and ca.
The key option specifies the private key used for SSL. The cert value specifies
the x509 public key to use. If you are using a self-signed certificate, you need to
point the ca property at the certificate for the client.
Table 8.9 Additional options for tls.createServer()
Option
Description
pfx
A string or Buffer object containing the private key,
certificate, and CA certs of the server in PFX or
PKCS12 format.
key
A string or Buffer object containing the private key
to use for SSL.
passphrase
A string containing the passphrase for the private key
or pfx.
cert
A string or Buffer object containing the public x509
certificate to use.
ca
An array of strings or buffers of trusted certificates in
PEM format to check the remote host against.
crl
Either a string or list of strings of PEM encoded CRLs
(Certificate Revocation Lists).
ciphers
A string describing the ciphers to use or exclude.
Using this in conjunction with the
honorCipherOrder is a good way to prevent
BEAST attacks.
handshakeTimeout
Specifies the number of milliseconds to wait before
aborting the connection if the SSL/TLS handshake
does not finish. If the timeout is hit, a clientError
is emitted on the tls.Server.
honorCipherOrder
A Boolean; when true, the server honors the server’s
preferences over the client’s when choosing a cipher.
requestCert
When true, the server requests a certificate from
clients that connect and attempts to verify that
certificate. Default is false.
rejectUnauthorized
When true, the server rejects any connection that is
not authorized by the list of supplied CAs. This option
only has an effect if requestCert is true. Default
is false.
NPNProtocols
An Array or Buffer of possible NPN protocols.
Protocols should be ordered by their priority.
SNICallback
A function that is called if the client supports the SNI
TLS extension. The server name is the only argument
passed to the callback.
sessionIdContext
A string containing an opaque identifier for session
resumption. If requestCert is true, the default is
an MD5 hash value generated from the command line.
Otherwise, the default is not provided.
secureProtocol
Specifies the SSL method to use. For example,
SSLv3_method will force SSL version 3.
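Before setting the ciphers option, it can help to check which cipher names the installed Node.js build actually supports. The following is a minimal sketch that uses tls.getCiphers(), which returns the supported names in lowercase. The buildCipherList() helper and the cipher names passed to it are hypothetical choices made only for this illustration:

```javascript
var tls = require('tls');

// Names of the ciphers supported by this Node.js build, in lowercase.
var supported = tls.getCiphers();

// Hypothetical helper: keeps only the requested ciphers that are available
// and joins them into a colon-separated string for the ciphers option.
function buildCipherList(wanted, available) {
  return wanted.filter(function(name) {
    return available.indexOf(name.toLowerCase()) !== -1;
  }).join(':');
}

console.log('Supported ciphers: ' + supported.length);
console.log('Usable from the wish list: ' +
  buildCipherList(['ECDHE-RSA-AES128-GCM-SHA256', 'NOT-A-CIPHER'], supported));
```

The resulting string could then be assigned to the ciphers property of the options object passed to tls.createServer().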
The following shows an example of creating a TLS socket server in Node.js:
var options = {
key: fs.readFileSync('test/keys/server.pem'),
cert: fs.readFileSync('test/keys/server.crt'),
ca: fs.readFileSync('test/keys/client.crt')
};
tls.createServer(options, function (client) {
client.write("Hello Secure World\r\n");
client.end();
}).listen(8108);
Once the TLS socket server has been created, the request/response handling works
basically the same way that the TCP socket servers described earlier in this chapter
work. The server can accept connections and read and write data back to the client.
Table 8.10 Additional events on TLS Server objects
Event
Description
secureConnection
Emitted when a new secure connection has been
successfully established. The callback accepts a single
instance of a tls.CleartextStream streaming
object that can be written to and read from. For example:
function (clearStream)
clientError
Emitted when a client connection emits an error. The
parameters to the callback are the error and a
tls.SecurePair object. For example:
function (error, securePair)
newSession
Emitted when a new TLS session is created. The callback
is passed the sessionId and sessionData
parameters containing the session information. For
example:
function (sessionId, sessionData)
resumeSession
Emitted when the client tries to resume a previous TLS
session. You can store the session in an external storage
so that you can look it up when receiving this event. The
callback handler receives two parameters. The first is a
sessionId, and the second is a callback to be
executed if the session cannot be established. For
example:
function (sessionId, callback)
Summary
Sockets are useful when implementing backend services in a Node.js application.
They allow a service on one system to communicate with a service on another
system through an IP address and port. They also provide the ability to implement an
IPC between two different processes running on the same server. The net module
allows you to create Server objects that act as socket servers and Socket objects
that act as socket clients. Since the Socket object extends Duplex streams, you
can read and write data from both the server and the client. For secure connections,
Node.js provides the tls module that allows you to implement secure TLS socket
servers and clients.
Next
In the next chapter, you learn how to implement multiprocessing in a Node.js
environment. This allows you to farm work out to other processes on the system to
take advantage of multiprocessor servers.
9
Scaling Applications Using Multiple
Processors in Node.js
In Chapter 4, “Using Events, Listeners, Timers, and Callbacks in Node.js,” you
learned that Node.js applications run on a single thread rather than multiple threads.
Using the single thread for application processing makes Node.js processes more
efficient and faster. But most servers have multiple processors, and you can scale
your Node.js applications by taking advantage of them. Node.js allows you to fork
work from the main application to separate processes that can then be processed in
parallel with each other and the main application.
To facilitate using multiple processes, Node.js provides three specific modules. The
process module provides access to the running processes. The child_process
module provides the ability to create child processes and communicate with them.
The cluster module implements clustered servers that share the same port, thus
allowing multiple requests to be handled simultaneously.
Understanding the Process Module
The process module is a global object that can be accessed from your Node.js
applications without the need to use a require(). This object gives you access to
the running processes as well as information about the underlying hardware
architecture.
Understanding Process I/O Pipes
The process module provides access to the standard I/O pipes for the process:
stdin, stdout, and stderr. stdin is the standard input pipe for the process,
which is typically the console. You can read input from the console using the
following code:
process.stdin.on('data', function(data){
console.log("Console Input: " + data);
});
When you type in data to the console and press Enter, the data is written back out.
For example:
some data
Console Input: some data
The stdout and stderr attributes of the process module are Writable
streams that can be treated accordingly.
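For example, because stdout and stderr are Writable streams, you can call write() on them directly. In this minimal sketch, formatLine() is a hypothetical helper and the messages are placeholders:

```javascript
// Hypothetical helper: builds one log line with a level prefix.
function formatLine(level, message) {
  return '[' + level + '] ' + message + '\n';
}

// Normal output goes to stdout; diagnostics go to stderr.
process.stdout.write(formatLine('info', 'service started'));
process.stderr.write(formatLine('error', 'something went wrong'));
```

Unlike console.log(), write() does not append a newline, which is why the helper adds one.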
Understanding Process Signals
A great feature of the process module is that it allows you to register listeners to
handle signals sent to the process from the OS. This is helpful when you need to
perform certain actions, such as clean up before a process is stopped or terminated.
Table 9.1 lists the process events that you can add listeners for.
To register for a process signal, simply use the on(event, callback) method.
For example, to register an event handler for the SIGBREAK event, you would use
the following code:
process.on('SIGBREAK', function(){
console.log("Got a SIGBREAK");
});
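A common use of signal listeners is to clean up before the process stops. The following is a minimal sketch, assuming that the cleanup work is synchronous; onSignalOnce() is a hypothetical helper name. It uses once() so that a second signal falls back to the default behavior:

```javascript
var cleanedUp = [];

// Hypothetical helper: runs cleanup(signal) the first time the signal arrives.
function onSignalOnce(signal, cleanup) {
  process.once(signal, function() {
    cleanup(signal);
  });
}

onSignalOnce('SIGINT', function(signal) {
  cleanedUp.push(signal);
  console.log('Cleaning up after ' + signal);
});
```

In a long-running application, the cleanup function would typically close servers or files and then call process.exit(), because registering a listener for SIGINT replaces the default behavior of exiting.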
Table 9.1 Events that can be sent to Node.js processes
Event
Description
SIGUSR1
Emitted when the Node.js debugger is started. You can add a listener;
however, you cannot stop the debugger from starting.
SIGPIPE
Emitted when the process tries to write to a pipe without a process
connected on the other end.
SIGHUP
Emitted on Windows when the console window is closed, and on
other platforms under various similar conditions. Note: Windows
terminates Node.js about 10 seconds after sending this event.
SIGTERM
Emitted when a request is made to terminate the process. This is not
supported on Windows.
SIGINT
Emitted when a Break is sent to the process. For example, when
Ctrl+C is pressed.
SIGBREAK
Emitted on Windows when Ctrl+Break is pressed.
SIGWINCH
Emitted when the console has been resized. On Windows, this is
emitted only when you write to the console, when the cursor is being
moved, or when a readable TTY is used in raw mode.
SIGKILL
Emitted on a process kill. Cannot have a listener installed.
SIGSTOP
Emitted on a process stop. Cannot have a listener installed.
Controlling Process Execution with the process Module
The process module also gives you some control over the execution of processes,
specifically, the ability to stop the current process, kill another process, or schedule
work to run on the event queue. These methods are attached directly to the
process module. For example, to exit the current Node.js process, you would use:
process.exit(0)
Table 9.2 lists the available process control methods on the process module.
Table 9.2 Methods that can be called on the process module to affect process
execution
Method
Description
abort()
Causes the current Node.js application to emit an
abort event, exit, and generate a memory core.
exit([code])
Causes the current Node.js application to exit and
return the specified code.
kill(pid,
[signal])
Causes the OS to send a kill signal to the process with
the specified pid. The default signal is SIGTERM,
but you can specify another.
nextTick(callback)
Schedules the callback function on the Node.js
application’s queue.
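To illustrate how nextTick() schedules work, the following minimal sketch records the order in which three pieces of code run. The order array is a name introduced only for this illustration:

```javascript
var order = [];

order.push('synchronous code');

setTimeout(function() {
  order.push('setTimeout callback');
  // By the time the timer fires, the nextTick callback has already run.
  console.log(order.join(' -> '));
}, 0);

process.nextTick(function() {
  order.push('nextTick callback');
});
```

The nextTick() callback runs after the current code finishes but before timer callbacks, so it is recorded ahead of the setTimeout() callback even though it was scheduled later.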
Getting Information from the process Module
The process module provides a wealth of information about the running process
and the system architecture. This information can be useful when implementing your
applications. For example, the process.pid property gives you the process ID
that can then be used by your application.
Table 9.3 lists the properties and methods that you can access from the process
module and describes what they return.
Table 9.3 Properties and methods of the process module used to gather
information
Property or method
Description
version
Specifies the version of Node.js.
versions
Provides an object containing the required modules and version fo
config
Contains the configuration options used to compile the current no
argv
Contains the command arguments used to start the Node.js applic
is the path to the main JavaScript file.
execPath
Specifies the absolute path where Node.js was started from.
execArgv
Specifies the node-specific command-line options used to start the
chdir(directory)
Changes the current working directory for the application. Th
loaded after the application has started.
cwd()
Returns the current working directory for the process.
env
Contains the key/value pairs specified in the environment for the
pid
Specifies the current process’s ID.
title
Specifies the title of the currently running process.
arch
Specifies the processor architecture the process is running on (for
platform
Specifies the OS platform (for example, linux, win32, or
memoryUsage()
Describes the current memory usage of the Node.js process. You
object. For example:
console.log(util.inspect(process.memoryUsage()));{ rss:
maxTickDepth
Specifies the maximum number of events scheduled by nextTick
from being processed. You should adjust this value as necessary t
uptime()
Returns the number of seconds the Node.js process has been
hrtime()
Returns a high-resolution time in a tuple array [seconds,
implement a granular timing mechanism.
getgid()
On POSIX platforms, returns the numerical group ID for this proc
setgid(id)
On POSIX platforms, sets the numerical group ID for this process
getuid()
On POSIX platforms, returns the numerical or string user ID for t
setuid(id)
On POSIX platforms, sets the numerical or string user ID for this
getgroups()
On POSIX platforms, returns an array of group IDs.
setgroups(groups)
On POSIX platforms, sets the supplementary group IDs. Your No
initgroups(user,
extra_group)
On POSIX platforms, initializes the group access list with the info
needs root privileges to call this method.
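As a small illustration of the argv property, the sketch below separates the arguments meant for the application from the first two entries, which hold the path to the node executable and the path to the main JavaScript file. The userArgs() helper is a hypothetical name, not part of the process module.

```javascript
// Hypothetical helper: return only the arguments meant for the application.
function userArgs(argv) {
  // argv[0] is the node executable and argv[1] is the script path.
  return argv.slice(2);
}

console.log('Script arguments: ' + JSON.stringify(userArgs(process.argv)));
console.log('Running as PID ' + process.pid + ' on ' +
            process.platform + '/' + process.arch);
```

For example, starting a script with node app.js --port 8080 would leave ["--port","8080"] after the slice.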
To help you understand accessing information using the process module, Listing
9.1 makes a series of calls and outputs the results to the console, as shown in Listing
9.1 Output.
Listing 9.1 process_info.js: Accessing information about the process and
system using the process module
var util = require('util');
console.log('Current directory: ' + process.cwd());
console.log('Environment Settings: ' + JSON.stringify(process.env));
console.log('Node Args: ' + process.argv);
console.log('Execution Path: ' + process.execPath);
console.log('Execution Args: ' + JSON.stringify(process.execArgv));
console.log('Node Version: ' + process.version);
console.log('Module Versions: ' + JSON.stringify(process.versions));
//console.log(process.config);
console.log('Process ID: ' + process.pid);
console.log('Process Title: ' + process.title);
console.log('Process Platform: ' + process.platform);
console.log('Process Architecture: ' + process.arch);
console.log('Memory Usage: ' + util.inspect(process.memoryUsage()));
var start = process.hrtime();
setTimeout(function() {
  var delta = process.hrtime(start);
  console.log('High-Res timer took %d seconds and %d nanoseconds', delt
  console.log('Node has been running %d seconds', process.uptime());
}, 1000);
Listing 9.1 Output Accessing information about the process and system using
the process module
Current directory: C:\Users\CalebTZD\workspace\node\code\ch09
Environment Settings:
Node Args: C:\Program Files\nodejs\node.exe,C:\Users\CalebTZD\workspace\no
Execution Path: C:\Program Files\nodejs\node.exe
Execution Args: []
Node Version: v7.8.0
Module Versions: Node Config:
Process ID: 12896
Process Title: C:\Program Files\nodejs\node.exe
Process Platform: win32
Process Architecture: x64
Memory Usage: { rss: 20054016,
heapTotal: 5685248,
heapUsed: 3571496,
external: 8772 }
High-Res timer took 1 seconds and 913430 nanoseconds
Node has been running 1.123 seconds
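The [seconds, nanoseconds] tuple returned by hrtime() is often easier to work with as a single number. The following sketch shows one way to do that; hrtimeToMs() and timeIt() are hypothetical helper names, and the loop being timed is only a stand-in for real work.

```javascript
// Convert an hrtime tuple [seconds, nanoseconds] to milliseconds.
function hrtimeToMs(tuple) {
  return tuple[0] * 1000 + tuple[1] / 1e6;
}

// Hypothetical helper: time a synchronous function call in milliseconds.
function timeIt(fn) {
  var start = process.hrtime();
  fn();
  // Passing the start tuple back in returns the elapsed time since start.
  return hrtimeToMs(process.hrtime(start));
}

var elapsed = timeIt(function () {
  var total = 0;
  for (var i = 0; i < 1e6; i++) {
    total += i;
  }
});
console.log('Loop took ' + elapsed.toFixed(3) + ' ms');
```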
Implementing Child Processes
To take advantage of multiple processors in a server with your Node.js applications,
you need to farm work off to child processes. Node.js provides the
child_process module that allows you to spawn, fork, and execute work on
other processes. The following sections discuss the process of executing tasks on
other processes.
Keep in mind that child processes do not have direct access to the global memory in
each other or the parent process. Therefore, you need to design your applications to
run in parallel.
Understanding the ChildProcess Object
The child_process module provides a new class called ChildProcess that
acts as a representation of the child processes that can be accessed from the parent.
This allows you to control, end, and send messages to the child processes from the
parent process that started them.
The process object in a child process offers a similar interface. The parent
works with the ChildProcess object returned when it starts the child, whereas
code running in the child uses its own process object, calling process.send()
and listening for message events, to communicate with the parent.
The purpose of this section is to familiarize you with the ChildProcess object so
that in subsequent sections you can actually implement multiprocess Node.js
applications. The best way to do that is to learn about the events, attributes, and
methods of the ChildProcess object.
Table 9.4 lists the events that can be emitted on the ChildProcess object. You
implement handlers for the events to handle when the child process terminates or
sends messages back to the parent.
Table 9.4 Events that can be emitted on ChildProcess objects
Event
Description
message
Emitted when a ChildProcess object calls the send() method
to send data. Listeners on this event implement a callback that
can then read the data sent. For example:
child.on('message', function(message){console.log(message);});
error
Emitted when an error occurs in the worker. The handler receives
an error object as the only parameter.
exit
Emitted when a worker process ends. The handler receives two
arguments, code and signal, that specify the exit code and the
signal passed to kill the process if it was killed by the parent.
close
Emitted when all the stdio streams of a worker process have
terminated. Different from exit because multiple processes might
share the same stdio streams.
disconnect
Emitted when disconnect() is called on a worker.
Table 9.5 lists the methods that can be called on the child process. These methods
allow you to terminate, disconnect, or send messages to the child process. For
example, the following code can be called from the parent process to send an object
to the child process:
child.send({cmd: 'command data'});
Table 9.5 Methods that can be called on ChildProcess objects
Method
Description
kill([signal])
Causes the OS to send a kill signal to the child process. The
default signal is SIGTERM, but you can specify another. See
Table 9.1 for a list of signal strings.
send(message,
[sendHandle])
Sends a message to the child process. The message can be a string
or an object. The optional sendHandle parameter allows
you to send a TCP Server or Socket object to the child.
This allows the child process to share the same port and
address.
disconnect()
Closes the IPC channel between the parent and child and sets
the connected flag to false in both the parent and child
processes.
Table 9.6 lists the properties that you can access on a ChildProcess object.
Table 9.6 Properties that can be accessed on ChildProcess objects
Property
Description
stdin
An input Writable stream.
stdout
A standard output Readable stream.
stderr
A standard output Readable stream for errors.
pid
An ID of the process.
connected
A Boolean that is set to false after disconnect() is called.
When this is false, you can no longer send() messages to the
child.
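The events, methods, and properties in Tables 9.4 through 9.6 can be seen working together in the following sketch. It assumes the spawn() function, which is covered later in this chapter, and starts a second copy of node through process.execPath so that it does not depend on a platform-specific command. The sleeper variable name is hypothetical.

```javascript
var spawn = require('child_process').spawn;

// Start a second Node.js instance that would otherwise run forever.
var sleeper = spawn(process.execPath,
                    ['-e', 'setInterval(function(){}, 1000)']);
console.log('Started child with pid ' + sleeper.pid);

// The exit handler receives the exit code and the signal (see Table 9.4).
sleeper.on('exit', function (code, signal) {
  console.log('Child exited, code: ' + code + ', signal: ' + signal);
});

// Ask the OS to terminate the child; SIGTERM is the default signal.
sleeper.kill();
```

Because the child is ended by a signal rather than by exiting on its own, the handler receives the signal name in its second argument.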
Executing a System Command on Another Process Using
exec()
The simplest method of adding work to another process from a Node.js process is to
execute a system command in a subshell using the exec() function. The exec()
function can execute just about anything that can be executed from a console
prompt; for example, a binary executable, shell script, Python script, or batch file.
When executed, the exec() function creates a system subshell and then executes a
command string in that shell just as if you had executed it from a console prompt.
This has the advantage of being able to leverage the capabilities of a console shell,
such as accessing environment variables on the command line.
The syntax for the exec() function call is shown below. The exec()
function call returns a ChildProcess object:
child_process.exec(command, [options], callback)
The command parameter is a string that specifies the command to execute in the
subshell. The options parameter is an object that specifies settings to use when
executing the command, such as the current working directory. Table 9.7 lists the
options that can be specified by the exec() command.
The callback parameter is a function that accepts three parameters: error,
stdout, and stderr. The error parameter is passed an error object if an error is
encountered when executing the command. stdout and stderr contain the output
from executing the command; they are strings when an encoding such as utf8 is
used and Buffer objects when the encoding is set to buffer.
Table 9.7 Options that can be set when using the exec() and execFile()
functions
Property
Description
cwd
Specifies the current working directory for the child process to
execute within.
env
Object whose property:value pairs are used as environment
key/value pairs.
encoding
Specifies the encoding to use for the output buffers when storing
output from the command.
maxBuffer
Specifies the size of the output buffers for stdout and stderr.
The default value is 200*1024.
timeout
Specifies the number of milliseconds for the parent process to wait
before killing the child process if it has not completed. The default
is 0, which means there is no timeout.
killSignal
Specifies the kill signal to use when terminating the child process.
The default is SIGTERM.
The code in Listing 9.2 illustrates an example of executing a system command using
the exec() function. Listing 9.2 Output shows the result.
Listing 9.2 child_exec.js: Executing a system command in another process
var childProcess = require('child_process');
var options = {maxBuffer:100*1024, encoding:'utf8', timeout:5000};
var child = childProcess.exec('dir /B', options,
  function (error, stdout, stderr) {
    if (error) {
      console.log(error.stack);
      console.log('Error Code: '+error.code);
      console.log('Error Signal: '+error.signal);
    }
    console.log('Results: \n' + stdout);
    if (stderr.length){
      console.log('Errors: ' + stderr);
    }
  });
child.on('exit', function (code) {
  console.log('Completed with code: '+code);
});
Listing 9.2 Output child_exec.js: Executing a system command in another
process
Completed with code: 0
Results:
chef.js
child_fork.js
child_process_exec.js
child_process_exec_file.js
child_process_spawn.js
cluster_client.js
cluster_server.js
cluster_worker.js
file.txt
process_info.js
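Listing 9.2 relies on the Windows dir command. As a platform-neutral variation, the following sketch runs the node executable itself through the subshell and chains a second command with &&, which only a shell can interpret. The runInShell() helper name and the 5000 ms timeout are assumptions made for illustration.

```javascript
var exec = require('child_process').exec;

// Quote the path to the node executable because it may contain spaces.
var command = '"' + process.execPath + '" --version && echo done';

// Hypothetical helper: run a command string in a subshell with a timeout.
function runInShell(cmd, callback) {
  return exec(cmd, {encoding: 'utf8', timeout: 5000},
    function (error, stdout, stderr) {
      if (error) {
        return callback(error);
      }
      callback(null, stdout);
    });
}

runInShell(command, function (error, output) {
  if (error) {
    console.log('Command failed: ' + error.message);
    return;
  }
  console.log('Shell output:\n' + output);
});
```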
Executing an Executable File on Another Process Using
execFile()
Another simple method of adding work to another process from a Node.js process is
to execute an executable file on another process using the execFile() function.
This is similar to using exec() except that no subshell is used. This makes
execFile() lighter weight, but it also means that the file must be something
the operating system can launch directly, such as a binary executable. Files that
need a shell to interpret them, such as batch files on Windows, do not
work with the execFile() function.
The syntax for the execFile() function call is shown below. The execFile()
function returns a ChildProcess object:
child_process.execFile(file, args, options, callback)
The file parameter is a string that specifies the path to the executable file that will
be executed. The args parameter is an array that specifies command-line arguments
to be passed to the executable. The options parameter is an object that specifies
settings to use when executing the command, such as the current working directory.
Table 9.7 lists the options that can be specified by the execFile() command.
The callback parameter is a function that accepts three parameters: error,
stdout, and stderr. The error parameter is passed an error object if an error is
encountered when executing the command. stdout and stderr contain the output
from executing the command; they are strings when an encoding such as utf8 is
used and Buffer objects when the encoding is set to buffer.
Listing 9.3 illustrates executing a system command using the execFile()
function. Listing 9.3 Output shows the output.
Listing 9.3 child_process_exec_file.js: Executing an executable file in
another process
var childProcess = require('child_process');
var options = {maxBuffer:100*1024, encoding:'utf8', timeout:5000};
var child = childProcess.execFile('ping.exe', ['-n', '1', 'google.com'],
  options, function (error, stdout, stderr) {
    if (error) {
      console.log(error.stack);
      console.log('Error Code: '+error.code);
      console.log('Error Signal: '+error.signal);
    }
    console.log('Results: \n' + stdout);
    if (stderr.length){
      console.log('Errors: ' + stderr);
    }
  });
child.on('exit', function (code) {
  console.log('Child completed with code: '+code);
});
Listing 9.3 Output child_process_exec_file.js: Executing an
executable file in another process
Child completed with code: 0
Results:
Pinging google.com [216.58.195.78] with 32 bytes of data:
Reply from 216.58.195.78: bytes=32 time=47ms TTL=55
Ping statistics for 216.58.195.78:
Packets: Sent = 1, Received = 1, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 47ms, Maximum = 47ms, Average = 47ms
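Because execFile() does not use a subshell, each entry in the args array reaches the executable exactly as written. The sketch below illustrates this by passing an argument with a space and one that looks like a shell variable to a second copy of node. The runNode() helper name is hypothetical, and process.execPath stands in for any executable file.

```javascript
var execFile = require('child_process').execFile;

// The child script prints the extra arguments it received, separated by "|".
var args = ['-e', 'console.log(process.argv.slice(1).join("|"))',
            'two words', '$HOME'];

// Hypothetical helper: run the node executable directly, without a shell.
function runNode(argList, callback) {
  return execFile(process.execPath, argList, {timeout: 5000},
    function (error, stdout, stderr) {
      if (error) {
        return callback(error);
      }
      callback(null, stdout);
    });
}

runNode(args, function (error, output) {
  if (error) {
    console.log('Execution failed: ' + error.message);
    return;
  }
  // No shell is involved, so $HOME is not expanded and the space is preserved.
  console.log('Child received: ' + output);
});
```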
Spawning a Process in Another Node.js Instance Using
spawn()
A more complex method of adding work to another process from a Node.js process
is to spawn another process; link the stdin, stdout, and stderr pipes between
them; and then execute a file on the new process using the spawn() function. That
makes spawning a bit heavier than exec() but provides some great benefits.
The major differences between spawn() and exec()/execFile() are that the
stdin for the spawned process can be configured and the stdout and stderr
are Readable streams in the parent process. This means that exec() and
execFile() must complete before reading the buffer outputs. However, you can
read output data from a spawn() process as soon as it is written.
The syntax for the spawn() function call is shown below. The spawn() function
returns a ChildProcess object:
child_process.spawn(command, [args], [options])
The command parameter is a string that specifies the command to be executed. The
args parameter is an array that specifies command-line arguments to be passed to
the executable command. The options parameter is an object that specifies
settings to use when executing the command, such as the current working directory.
Table 9.8 lists the options that can be specified by the spawn() command.
Unlike exec() and execFile(), spawn() does not accept a callback
parameter. Instead, you register event handlers on the returned ChildProcess
object and on its stdout and stderr properties. These are defined
by the stdio option settings; by default they are Readable stream objects.
Table 9.8 Properties of the options parameter that can be set when using the
spawn() function
Property
Description
cwd
A string representing the current working directory of the child
process.
env
An object whose property:value pairs are used as environment
key/value pairs.
detached
A Boolean; when true, this child process is made the leader of a
new process group enabling the process to continue even when the
parent exits. You should also use child.unref() so that the
parent process does not wait for the child process before exiting.
uid
Specifies the user identity of the process for POSIX processes.
gid
Specifies the group identity of the process for POSIX processes.
stdio
An array that defines the child process stdio configuration
([stdin, stdout, stderr]). By default, Node.js opens file
descriptors [0, 1, 2] for [stdin, stdout, stderr]. The
strings define the configuration of each input and output stream. For
example:
['pipe', 'pipe', 'pipe']
The following list describes each of the options that can be used:
'pipe': Creates a pipe between the child and parent process. The
parent can access the pipe using ChildProcess.stdio[fd]
where fd is the file descriptors [0, 1, 2] for [stdin,
stdout, stderr].
'ipc': Creates an IPC channel for passing messages/file descriptors
between the parent and child using the send() method described
earlier.
'ignore': Does not set up a file descriptor in the child.
Stream object: Specifies a Readable or Writable stream
object defined in the parent to use. The Stream object’s underlying
file descriptor is duplicated in the child and thus data can be streamed
from child to parent and vice versa.
File Descriptor Integer: Specifies the integer value of a file
descriptor to use.
null, undefined: Uses the defaults of [0, 1, 2] for the
[stdin, stdout, stderr] values.
Listing 9.4 illustrates executing a system command using the spawn() function.
Listing 9.4 Output shows the output.
Listing 9.4 child_process_spawn_file.js: Spawning a command in
another process
var spawn = require('child_process').spawn;
// These options are shown for illustration; they are not passed to spawn() below.
var options = {
  env: {user:'brad'},
  detached:false,
  stdio: ['pipe','pipe','pipe']
};
var child = spawn('netstat', ['-e']);
child.stdout.on('data', function(data) {
  console.log(data.toString());
});
child.stderr.on('data', function(data) {
  console.log(data.toString());
});
child.on('exit', function(code) {
  console.log('Child exited with code', code);
});
Listing 9.4 Output child_process_spawn_file.js: Spawning a
command in another process
Interface Statistics
                           Received            Sent
Bytes                     893521612       951835252
Unicast packets              780762         5253654
Non-unicast packets           94176           31358
Discards                          0               0
Errors                            0               0
Unknown protocols                 0
Child exited with code 0
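The main benefit of spawn(), reading output as soon as it is written, is easier to see with a child that produces output over time. The following sketch spawns a second copy of node that prints three lines at 100 ms intervals; the ticker and received names and the inline child script are assumptions made for illustration.

```javascript
var spawn = require('child_process').spawn;

// Child script: write one line every 100 ms, three times in total.
var script = 'var n = 0; var t = setInterval(function () {' +
             ' console.log("tick " + (++n));' +
             ' if (n === 3) clearInterval(t); }, 100);';

// Ignore stdin, pipe stdout to the parent, and share the parent's stderr.
var ticker = spawn(process.execPath, ['-e', script],
                   {stdio: ['ignore', 'pipe', 'inherit']});
var received = '';

ticker.stdout.setEncoding('utf8');
ticker.stdout.on('data', function (chunk) {
  // Each chunk is available while the child is still running.
  received += chunk;
  console.log('Got chunk: ' + JSON.stringify(chunk));
});

ticker.on('close', function (code) {
  console.log('Child closed with code ' + code);
});
```

The close event is used here instead of exit so that all stdout data has been delivered before the final message is logged.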
Implementing Child Forks
Node.js also provides a specialized form of process spawning called a fork, which is
designed to execute Node.js module code inside another V8 instance running on a
separate processor. This has the advantage of allowing you to run multiple services
in parallel. However, it also takes time to spin up a new instance of V8, and each
instance takes about 10MB of memory. Therefore, you should design your forked
processes to be longer lived, and not require many of them. Remember that you
don’t get a performance benefit for creating more processes than you have CPUs in
the system.
Unlike spawn, you cannot configure the stdio for the child process; instead it is
expected that you use the send() mechanism in the ChildProcess object to
communicate between the parent and child processes.
The syntax for the fork() function call is shown below. The fork() function
returns a ChildProcess object:
child_process.fork(modulePath, [args], [options])
The modulePath parameter is a string that specifies the path to the JavaScript file
that is launched by the new Node.js instance. The args parameter is an array that
specifies command-line arguments to be passed to the node command. The
options parameter is an object that specifies settings to use when executing the
command, such as the current working directory. Table 9.9 lists the options that can
be specified by the fork() command.
Like spawn(), fork() does not accept a callback parameter. Instead, you
register handlers for events such as message, error, and exit on the returned
ChildProcess object.
Table 9.9 Properties of the options parameter that can be set when using the
fork() function
Property
Description
cwd
A string representing the current working directory of the child
process.
env
An object whose property:value pairs are used as environment
key/value pairs.
encoding
Specifies the encoding to use when writing data to the output streams
and across the send() IPC mechanism.
execPath
Specifies the executable to use to create the spawned Node.js process.
This allows you to use different versions of Node.js for different
processes, although that is not recommended in case the process
functionality is different.
silent
A Boolean; when true, the stdout and stderr in the forked
process are not associated with the parent process. The default is
false.
Listing 9.5 and Listing 9.6 illustrate examples of forking work off to another Node.js
instance running in a separate process. Listing 9.5 uses fork() to create three child
processes running the code from Listing 9.6. The parent process then uses the
ChildProcess objects to send commands to the child processes. Listing 9.6
implements the process.on('message') callback to receive messages from
the parent, and the process.send() method to send the response back to the
parent process, thus implementing the IPC mechanism between the two.
The output is shown in Listing 9.5 Output.
Listing 9.5 child_fork.js: A parent process creating three child processes
and sending commands to each, executing in parallel
var child_process = require('child_process');
var options = {
  env:{user:'Brad'},
  encoding:'utf8'
};
function makeChild(){
  var child = child_process.fork('chef.js', [], options);
  child.on('message', function(message) {
    console.log('Served: ' + message);
  });
  return child;
}
function sendCommand(child, command){
  console.log("Requesting: " + command);
  child.send({cmd:command});
}
var child1 = makeChild();
var child2 = makeChild();
var child3 = makeChild();
sendCommand(child1, "makeBreakfast");
sendCommand(child2, "makeLunch");
sendCommand(child3, "makeDinner");
Listing 9.6 chef.js: A child process handling message events and sending
data back to the parent process
process.on('message', function(message, parent) {
  var meal = {};
  switch (message.cmd){
    case 'makeBreakfast':
      meal = ["ham", "eggs", "toast"];
      break;
    case 'makeLunch':
      meal = ["burger", "fries", "shake"];
      break;
    case 'makeDinner':
      meal = ["soup", "salad", "steak"];
      break;
  }
  process.send(meal);
});
Listing 9.5 Output chef.js: A child process handling message events and
sending data back to the parent process
Requesting: makeBreakfast
Requesting: makeLunch
Requesting: makeDinner
Served: soup,salad,steak
Served: ham,eggs,toast
Served: burger,fries,shake
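A forked child keeps running for as long as its IPC channel to the parent is open, so the parent normally disconnects the child when the work is finished. The following sketch shows that round trip. To stay self-contained it writes a tiny child module to a temporary file first; the file name, the worker and replies variables, and the message format are all hypothetical, and in a real application the child would be its own source file, as chef.js is above.

```javascript
var fork = require('child_process').fork;
var fs = require('fs');
var os = require('os');
var path = require('path');

// Write a small child module (hypothetical name) to the temp directory.
var childPath = path.join(os.tmpdir(), 'echo_child_' + process.pid + '.js');
fs.writeFileSync(childPath,
  "process.on('message', function (m) {" +
  "  process.send({reply: String(m.text).toUpperCase()});" +
  "});");

var worker = fork(childPath);
var replies = [];

worker.on('message', function (message) {
  replies.push(message.reply);
  console.log('Child replied: ' + message.reply);
  // Closing the IPC channel lets the child exit once it has no more work.
  worker.disconnect();
});

worker.on('exit', function (code) {
  fs.unlinkSync(childPath);   // Remove the temporary child module.
  console.log('Child exited with code ' + code);
});

worker.send({text: 'hello'});
```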
Implementing Process Clusters
One of the coolest things you can do with Node.js is create a cluster of Node.js
instances running in parallel in separate processes on the same machine. You can do
that using the techniques you learned about in the previous section by forking
processes and then using the send(message, serverHandle) IPC
mechanism to send messages and pass the underlying TCP server
handles between them. However, because that is such a common task, Node.js has
provided the cluster module that does all that for you automatically.
Using the Cluster Module
The cluster module provides the functionality necessary to easily implement a
cluster of TCP or HTTP servers running in different processes on the same machine
but still using the same underlying socket, thus handling requests on the same IP
address and port combination. The cluster module is simple to implement and
provides several events, methods, and properties that can be used to initiate and
monitor a cluster of Node.js servers.
Table 9.10 lists the events that can be emitted in a cluster application.
Table 9.10 Events that can be emitted by the cluster module
Event
Description
fork
Emitted when a new worker has been forked. The callback
function receives a Worker object as the only argument. For
example:
function (worker)
online
Emitted when the new worker sends back a message indicating
that it has started. The callback function receives a Worker
object as the only argument. For example:
function (worker)
listening
Emitted when the worker calls listen() to begin listening on
the shared port. The callback handler receives the worker
object as well as an address object indicating the port the
worker is listening on. For example:
function (worker, address)
disconnect
Emitted after the IPC channel has been disconnected, such as the
server calling worker.disconnect(). The callback
function receives a Worker object as the only argument. For
example:
function (worker)
exit
Emitted when a worker process has exited. The
callback handler receives the worker, exit code, and
signal used. For example:
function (worker, code, signal)
setup
Emitted the first time setupMaster() is called.
Table 9.11 lists the methods and properties available in the cluster module,
allowing you to get information such as whether this node is a worker or the master
as well as configuring and implementing the forked processes.
Table 9.11 Methods and properties of the cluster module
Property
Description
settings
Contains the exec, args, and silent
property values used to set up the cluster.
isMaster
Is true if the current process is the cluster
master; otherwise, it is false.
isWorker
Is true if the current process is a worker;
otherwise, it is false.
setupMaster([settings])
Accepts an optional settings object that
contains exec, args, and silent
properties. The exec property points to the
worker JavaScript file. The args property is
an array of parameters to pass, and silent is
a Boolean that, when true, keeps the worker's
output from being sent to the master's stdio.
disconnect([callback])
Disconnects the IPC mechanism from the
workers and closes the handles. The
callback function is executed when the
disconnect finishes.
worker
References the current Worker object in
worker processes. This is not defined in the
master process.
workers
Contains the Worker objects, which you can
reference by ID from the master process. For
example:
cluster.workers[workerId]
Understanding the Worker Object
When a worker process is forked, a new Worker object is created in both the master
and worker processes. In the worker process, the object is used to represent the
current worker and interact with cluster events that are occurring. In the master
process, the Worker object is used to represent child worker processes so that your
master application can send messages to them, receive events on their state changes,
and even kill them.
Table 9.12 lists the events that Worker objects can emit.
Table 9.12 Events that can be emitted by Worker objects
Event
Description
message
Emitted when the worker receives a new message. The
callback function is passed the message as the only
parameter.
disconnect
Emitted after the IPC channel has been disconnected on this
worker.
exit
Emitted when this worker process has exited.
error
Emitted when an error has occurred on this worker.
Table 9.13 lists the methods and properties available in the Worker object, allowing
you to identify a worker, access its underlying process, send messages, and
shut the worker down.
Table 9.13 Methods and properties of the Worker object
Property
Description
id
Represents the unique ID of this worker.
process
Specifies the ChildProcess object this worker is running
on.
suicide
Is set to true when kill() or disconnect() is called
on this worker. You can use this flag to determine whether
you should break out of loops to try and go down gracefully.
send(message,
[sendHandle])
Sends a message to the master process.
kill([signal])
Kills the current worker process by disconnecting the IPC
channel and then exiting. Sets the suicide flag to true.
disconnect()
When called in the worker, closes all servers, waits for the
close event, and then disconnects the IPC channel. When
called from the master, sends an internal message to the
worker causing it to disconnect itself. Sets the suicide
flag.
Implementing an HTTP Cluster
The best way to illustrate the value of the cluster module is to show a basic
implementation of Node.js HTTP servers. Listing 9.7 implements a basic cluster of
HTTP servers. Lines 4–13 register listeners for the fork, listening, and exit
events on cluster workers. Then in line 14 setupMaster() is called and the
worker executable cluster_worker.js is specified. Next, lines 15–19 create
the workers by calling cluster.fork(). Finally, in lines 20–24 the code iterates
through the workers and registers an on('message') event handler for each one.
Listing 9.8 implements the worker HTTP servers. Notice that the http server sends
back a response to the client and then also sends a message to the cluster master on
line 7.
Listing 9.9 implements a simple HTTP client that sends a series of requests to test
the servers created in Listing 9.8. The output of the servers is shown in Listing 9.7
and 9.8 Output, and the output of the clients is shown in Listing 9.9 Output. Notice
that Listing 9.9 Output shows that the requests are being handled by different
processes on the server.
Listing 9.7 cluster_server.js: A master process creating up to four
worker processes
01 var cluster = require('cluster');
02 var http = require('http');
03 if (cluster.isMaster) {
04   cluster.on('fork', function(worker) {
05     console.log("Worker " + worker.id + " created");
06   });
07   cluster.on('listening', function(worker, address) {
08     console.log("Worker " + worker.id +" is listening on " +
09                 address.address + ":" + address.port);
10   });
11   cluster.on('exit', function(worker, code, signal) {
12     console.log("Worker " + worker.id + " Exited");
13   });
14   cluster.setupMaster({exec:'cluster_worker.js'});
15   var numCPUs = require('os').cpus().length;
16   for (var i = 0; i < numCPUs; i++) {
17     if (i>=4) break;
18     cluster.fork();
19   }
20   Object.keys(cluster.workers).forEach(function(id) {
21     cluster.workers[id].on('message', function(message){
22       console.log(message);
23     });
24   });
25 }
Listing 9.8 cluster_worker.js: A worker process implementing an HTTP
server
01 var cluster = require('cluster');
02 var http = require('http');
03 if (cluster.isWorker) {
04   http.Server(function(req, res) {
05     res.writeHead(200);
06     res.end("Process " + process.pid + " says hello");
07     process.send("Process " + process.pid + " handled request");
08   }).listen(8080, function(){
09     console.log("Child Server Running on Process: " + process.pid);
10   });
11 };
Listing 9.9 cluster_client.js: An HTTP client sending a series of
requests to test the server
var http = require('http');
var options = { port: '8080' };
function sendRequest(){
  http.request(options, function(response){
    var serverData = '';
    response.on('data', function (chunk) {
      serverData += chunk;
    });
    response.on('end', function () {
      console.log(serverData);
    });
  }).end();
}
for (var i=0; i<5; i++){
  console.log("Sending Request");
  sendRequest();
}
Listing 9.7 and 9.8 Output cluster_server.js: A master process creating
up to four worker processes
Worker 1 created
Worker 2 created
Worker 3 created
Worker 4 created
Child Server Running on Process: 9012
Worker 1 is listening on null:8080
Child Server Running on Process: 1264
Worker 2 is listening on null:8080
Child Server Running on Process: 5488
Worker 4 is listening on null:8080
Child Server Running on Process: 7384
Worker 3 is listening on null:8080
Process 1264 handled request
Process 7384 handled request
Process 5488 handled request
Process 7384 handled request
Process 5488 handled request
Listing 9.9 Output cluster_client.js: An HTTP client sending a series of
requests to test the server
Sending Request
Sending Request
Sending Request
Sending Request
Sending Request
Process 10108 says hello
Process 12584 says hello
Process 13180 says hello
Process 10108 says hello
Process 12584 says hello
Summary
To make the most out of Node.js performance on servers with multiple processors,
you need to be able to farm work off to the other processes. The process module
allows you to interact with the system process, the child_process module
allows you to actually execute code on a separate process, and the cluster module
allows you to create a cluster of HTTP or TCP servers.
The child_process module provides the exec(), execFile(), spawn(),
and fork() functions, which are used to start work on separate processes. The
ChildProcess and Worker objects provide a mechanism to communicate
between the parent and child processes.
Next
In the next chapter, you are introduced to some other modules that Node.js provides
for convenience. For example, the os module provides tools to interact with the OS,
and the util module provides useful functionality.
10
Using Additional Node.js Modules
This chapter exposes you to some additional built-in capabilities of Node.js. The os
module provides operating system functionality that can be useful when
implementing your applications. The util module provides various functionality,
such as string formatting. The dns module provides the ability to perform DNS
lookups and reverse lookups from a Node.js application.
The following sections describe these modules and how to use them in your Node.js
applications. Some of the methods will already be familiar to you because you have
seen them in previous chapters.
Using the os Module
The os module provides a useful set of functions that allow you to get information
from the operating system (OS). For example, when accessing data from a stream
that comes from the OS, you can use the os.endianness() function to
determine whether the OS is big endian or little endian so that you can use the
correct read and write methods.
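As a minimal sketch of that idea, the following picks the little-endian or big-endian Buffer methods based on os.endianness(). The 32-bit sample value and the buffer are made up for illustration; they stand in for bytes that would normally come from an OS-level stream.

```javascript
var os = require('os');

// Illustrative value only; a real application would read bytes from a stream.
var sample = 0x01020304;
var buf = Buffer.alloc(4);
var littleEndian = (os.endianness() === 'LE');

// Write the value in the machine's native byte order.
if (littleEndian) {
  buf.writeUInt32LE(sample, 0);
} else {
  buf.writeUInt32BE(sample, 0);
}

// Read it back with the matching method.
var value = littleEndian ? buf.readUInt32LE(0) : buf.readUInt32BE(0);
console.log(os.endianness() + ' read: 0x' + value.toString(16));
```

Reading with the method that does not match the byte order would return a different number, which is why the check is worth making once up front.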
Table 10.1 lists the methods provided by the os module and describes how they are
used.
Table 10.1 Methods that can be called on the os module
Method                 Description
tmpdir()               Returns a string path to the default temp directory for
                       the OS. Useful if you need to store files temporarily
                       and then remove them later.
endianness()           Returns BE or LE for big endian or little endian,
                       depending on the architecture of the machine.
hostname()             Returns the hostname defined for the machine. This
                       is useful when implementing network services that
                       require a hostname.
type()                 Returns the OS type as a string.
platform()             Returns the platform as a string; for example,
                       win32, linux, or freebsd.
arch()                 Returns the platform architecture; for example, x64,
                       arm64, or ia32.
release()              Returns the OS version release.
uptime()               Returns the number of seconds the OS has been
                       running.
loadavg()              On UNIX-based systems, returns an array of values
                       containing the system load average for the last
                       [1, 5, 15] minutes. On Windows, the values are
                       always 0.
totalmem()             Returns an integer specifying the system memory in
                       bytes.
freemem()              Returns an integer specifying the free system
                       memory in bytes.
cpus()                 Returns an array of objects, one per logical CPU
                       core, that describe the model, speed, and times.
                       The times object contains the amount of time the
                       CPU has spent in user, nice, sys, idle, and irq.
networkInterfaces()    Returns an object keyed by network interface name.
                       Each value is an array of objects describing the
                       address and family of the addresses bound on that
                       interface.
EOL                    A property (not a method) that contains the
                       appropriate End Of Line characters for the operating
                       system; for example, \n or \r\n. This can be useful
                       to make your application cross-platform compatible
                       when processing string data.
To help you visualize using the os module, Listing 10.1 calls each of the os module
methods, and the output is shown in Listing 10.1 Output.
Listing 10.1 os_info.js: Calling methods on the os module
var os = require('os');
console.log("tmpdir :\t" + os.tmpdir());
console.log("endianness :\t" + os.endianness());
console.log("hostname :\t" + os.hostname());
console.log("type :\t\t" + os.type());
console.log("platform :\t" + os.platform());
console.log("arch :\t\t" + os.arch());
console.log("release :\t" + os.release());
console.log("uptime :\t" + os.uptime());
console.log("loadavg :\t" + os.loadavg());
console.log("totalmem :\t" + os.totalmem());
console.log("freemem :\t" + os.freemem());
console.log("EOL :\t" + os.EOL);
console.log("cpus :\t\t" + JSON.stringify(os.cpus()));
console.log("networkInterfaces : " +
JSON.stringify(os.networkInterfaces()));
Listing 10.1 Output Calling methods on the os module
tmpdir :        C:\Users\CalebTZD\AppData\Local\Temp
endianness :    LE
hostname :      DESKTOP-3I5OR8I
type :          Windows_NT
platform :      win32
arch :          x64
release :       10.0.14393
uptime :        1473719.6450068
loadavg :       0,0,0
totalmem :      12768796672
freemem :       8033443840
EOL :
cpus :
Using the util Module
The util module is a catch-all module that provides functions for formatting
strings, converting objects to strings, checking object types, performing synchronous
writes to output streams, and some object inheritance enhancements.
The following sections cover most of the functionality in the util module. They
also explain ways to use the util module in your Node.js applications.
Formatting Strings
When handling string data, it is important to be able to format the strings quickly.
Node.js provides a rudimentary string formatting method in the util module that
handles many string formatting needs. The util.format() function accepts a
formatter string as the first argument and returns a formatted string. The following
shows the syntax for the format() method, where format is the formatter string
and ...args represents the values that are substituted into it:
util.format(format[, ...args])
The format argument is a string that can contain zero or more placeholders. Each
placeholder begins with a % character and is replaced with the converted string value
from its corresponding argument. The first placeholder is replaced by the second
argument (the first one after the format string), and so on. The following is a
list of supported placeholders:
%s: Specifies a string
%d: Specifies a number (can be integer or float)
%i: Specifies an integer
%f: Specifies a floating point value
%j: Specifies a JSON stringifyable object
%%: Outputs a single percent sign and does not consume an argument
The following is a list of things to keep in mind when using format():
When there are not as many arguments as placeholders, the placeholder is not
replaced. For example:
util.format('%s = %s', 'Item1'); // 'Item1 = %s'
When there are more arguments than placeholders, the extra arguments are
converted to strings and concatenated with a space delimiter.
util.format('%s = %s', 'Item1', 'Item2', 'Item3'); // 'Item1 = Item2 Item3'
If the first argument is not a format string, then util.format() converts
each argument to a string, concatenates them together using a space delimiter,
and then returns the concatenated string. For example:
util.format(1, 2, 3); // '1 2 3'
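The rules above can be tried out with a short script. This is only a sketch; the names and values passed in are made up for illustration.

```javascript
var util = require('util');

// %s and %d are filled from the arguments that follow the format string.
var sentence = util.format('%s is %d years old', 'Caleb', 30);

// %j runs the argument through JSON.stringify().
var json = util.format('%j', { id: 1, tags: ['a', 'b'] });

// %% produces a literal percent sign and consumes no argument.
var progress = util.format('%d%% complete', 50);

// A placeholder without a matching argument is left untouched.
var partial = util.format('%s = %s', 'Item1');

console.log(sentence);  // Caleb is 30 years old
console.log(json);      // {"id":1,"tags":["a","b"]}
console.log(progress);  // 50% complete
console.log(partial);   // Item1 = %s
```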
Checking Object Types
It is often useful to determine whether an object you have received back from a
command is of a certain type. To do this, you can use the instanceof operator,
which checks whether an object was created from a given constructor and returns
true or false. For example:
([1,2,3] instanceof Array) //true
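The following sketch shows a few more type checks side by side. The Writer constructor is a hypothetical stand-in for any custom type; note that primitive strings are not instances of String, so typeof is the usual check for them.

```javascript
// Hypothetical constructor used only to demonstrate the checks.
function Writer() {}
var w = new Writer();

var checks = {
  writerIsWriter: w instanceof Writer,         // created by Writer
  writerIsObject: w instanceof Object,         // every object inherits from Object
  arrayIsArray: [1, 2, 3] instanceof Array,
  literalIsString: 'text' instanceof String,   // primitives are not instances
  typeofLiteral: typeof 'text',                // use typeof for primitives
  isArrayHelper: Array.isArray([1, 2, 3])      // built-in array check
};
console.log(checks);
```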
Converting JavaScript Objects to Strings
Often, especially when debugging, you need to convert a JavaScript object to a string
representation. The util.inspect() method allows you to inspect an object and
then return a string representation of the object.
The following shows the syntax for the inspect() method:
util.inspect(object, [options])
The object parameter is the JavaScript object you want to convert to a string. The
options parameter allows you to control certain aspects of the formatting process.
options can contain the following properties:
showHidden: When set to true, the non-enumerable properties of the object
are also converted to the string. Defaults to false.
depth: Limits the number of levels deep the inspect process traverses while
formatting properties that are also objects. This can prevent infinite loops and
also prevent instances where complex objects cost a lot of CPU cycles. Defaults
to 2; if it is null, it can recurse forever.
colors: When set to true, the output is styled with ANSI color codes.
Defaults to false.
customInspect: When set to false, any custom inspect() functions
defined on the objects being inspected are not called. Defaults to true.
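As a rough illustration of these options, the sketch below inspects a made-up nested object with different depth settings and an array with showHidden enabled. The exact line breaks in the output can vary between Node.js versions, so only the general shape is noted in the comments.

```javascript
var util = require('util');

// Illustrative object with four levels of nesting.
var nested = { level1: { level2: { level3: { level4: 'deep' } } } };

// depth: 0 stops at the top level; nested objects are summarized.
var shallow = util.inspect(nested, { depth: 0 });

// depth: null removes the limit, so every level is expanded.
var full = util.inspect(nested, { depth: null });

// showHidden: true also lists non-enumerable properties such as length.
var hidden = util.inspect([1, 2], { showHidden: true });

console.log(shallow);  // nested value appears as [Object]
console.log(full);
console.log(hidden);   // includes [length]
```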
You can attach your own inspect() function to the object, thus controlling the
output. The following code creates an object with first and last properties, but
inspect outputs only a name property:
var util = require('util');
var obj = { first:'Caleb', last:'Dayley' };
obj.inspect = function(depth) {
return '{ name: "' + this.first + " " + this.last + '" }';
};
console.log(util.inspect(obj));
//Outputs: { name: "Caleb Dayley" }
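In more recent Node.js releases, util.inspect() no longer consults a plain inspect() property; the hook is looked up under the util.inspect.custom symbol instead. The following is a sketch of the same example written that way, assuming a Node.js version that provides util.inspect.custom.

```javascript
var util = require('util');

var obj = { first: 'Caleb', last: 'Dayley' };

// Register the custom formatter under the well-known symbol.
obj[util.inspect.custom] = function (depth, options) {
  return '{ name: "' + this.first + ' ' + this.last + '" }';
};

console.log(util.inspect(obj));
// Outputs: { name: "Caleb Dayley" }
```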
Inheriting Functionality from Other Objects
The util module provides the util.inherits() method to allow you to create
objects that inherit the prototype methods from another. When you create the
new object, the prototype methods are automatically used. You have already
seen this in a few examples in the book; for example, when implementing your own
custom Readable and Writable streams.
The following shows the format of the util.inherits() method:
util.inherits(constructor, superConstructor)
The prototype of constructor is set to a new object created from the prototype of
superConstructor, so the inherited methods are available whenever a new object is
created. You can access the superConstructor from your custom object constructor
using the constructor.super_ property.
Listing 10.2 illustrates using inherits() to inherit the
events.EventEmitter object constructor to create a simple Writer object.
Notice on line 11 the object is an instance of events.EventEmitter. Also
notice on line 12 the Writer.super_ value is events.EventEmitter. The results are
shown in Listing 10.2 Output.
Listing 10.2 util_inherit.js: Using inherits() to inherit the prototypes
from events.EventEmitter
01 var util = require("util");
02 var events = require("events");
03 function Writer() {
04   events.EventEmitter.call(this);
05 }
06 util.inherits(Writer, events.EventEmitter);
07 Writer.prototype.write = function(data) {
08   this.emit("data", data);
09 };
10 var w = new Writer();
11 console.log(w instanceof events.EventEmitter);
12 console.log(Writer.super_ === events.EventEmitter);
13 w.on("data", function(data) {
14   console.log('Received data: "' + data + '"');
15 });
16 w.write("Some Data!");
Listing 10.2 Output util_inherit.js: Using inherits() to inherit the
prototypes from events.EventEmitter
true
true
Received data: "Some Data!"
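For comparison, the same Writer can be sketched with the ES6 class syntax, which sets up the prototype chain without calling util.inherits(). This is an alternative formulation, not the listing's approach, and it assumes a Node.js version with class support.

```javascript
var EventEmitter = require('events').EventEmitter;

// extends wires up the prototype chain that util.inherits() sets up manually.
class Writer extends EventEmitter {
  write(data) {
    this.emit('data', data);
  }
}

var received = [];
var w = new Writer();
w.on('data', function (data) {
  received.push(data);
  console.log('Received data: "' + data + '"');
});
w.write('Some Data!');
console.log(w instanceof EventEmitter);  // true
```

Unlike util.inherits(), the class form does not define a super_ property on Writer.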
Using the dns Module
If your Node.js application needs to resolve DNS domain names, look up domains,
or do reverse lookups, then the dns module is helpful. A DNS lookup contacts the
domain name server and requests records about a specific domain name. A reverse
lookup contacts the domain name server and requests the DNS name associated with
an IP address. The dns module provides functionality for most of the lookups that
you may need to perform. Table 10.2 lists the lookup calls and their syntax, and
describes how they are used.
Table 10.2 Methods that can be called on the dns module
Method                   Description
lookup(domain,           Resolves the domain. The family attribute can be 4,
[family], callback)      6, or null, where 4 resolves into the first found A
                         (IPv4) record, 6 resolves into the first found AAAA
                         (IPv6) record, and null accepts either. The default
                         is null. The callback function receives an error as
                         the first argument, the resolved IP address as the
                         second, and the address family (4 or 6) as the
                         third. For example:
                         function (error, address, family)
resolve(domain,          Resolves the domain into an array of record types
[rrtype], callback)      specified by rrtype. rrtype can be
                           A: IPv4 addresses, the default
                           AAAA: IPv6 addresses
                           MX: Mail eXchange records
                           TXT: Text records
                           SRV: SRV records
                           PTR: Reverse IP lookups
                           NS: Name Server records
                           CNAME: Canonical Name records
                         The callback function receives an error as the
                         first argument and an array of records as the
                         second (IP addresses for A and AAAA records). For
                         example:
                         function (error, addresses)
resolve4(domain,         Same as dns.resolve() except only for A records.
callback)
resolve6(domain,         Same as dns.resolve() except only for AAAA records.
callback)
resolveMx(domain,        Same as dns.resolve() except only for MX records.
callback)
resolveTxt(domain,       Same as dns.resolve() except only for TXT records.
callback)
resolveSrv(domain,       Same as dns.resolve() except only for SRV records.
callback)
resolveNs(domain,        Same as dns.resolve() except only for NS records.
callback)
resolveCname(domain,     Same as dns.resolve() except only for CNAME
callback)                records.
reverse(ip,              Does a reverse lookup on the ip address. The
callback)                callback function receives an error object if one
                         occurs and an array of domains if the lookup is
                         successful. For example:
                         function (error, domains)
Listing 10.3 illustrates performing lookups and reverse lookups. In line 3,
resolve4() is used to look up the IPv4 addresses, and then in lines 5–8,
reverse() is called on those same addresses and the reverse lookup performed.
Listing 10.3 Output shows the result.
Listing 10.3 dns_lookup.js: Performing lookups and then reverse lookups
on domains and IP addresses
01 var dns = require('dns');
02 console.log("Resolving www.google.com . . .");
03 dns.resolve4('www.google.com', function (err, addresses) {
04   console.log('IPv4 addresses: ' + JSON.stringify(addresses, false, ' '));
05   addresses.forEach(function (addr) {
06     dns.reverse(addr, function (err, domains) {
07       console.log('Reverse for ' + addr + ': ' + JSON.stringify(domains));
08     });
09   });
10 });
Listing 10.3 Output dns_lookup.js: Performing lookups and then reverse
lookups on domains and IP addresses
Resolving www.google.com . . .
IPv4 addresses: [
"172.217.6.68"
]
Reverse for 172.217.6.68: ["sfo07s17-in-f4.1e100.net","sfo07s17-in-f68.1e1
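Listing 10.3 ignores the err argument, so a failed lookup would throw when addresses is undefined. The following is a hedged sketch of the same flow with error checks added. The describeResult() helper, the resolveAndReverse() wrapper, and the file name in the comment are hypothetical names introduced for illustration.

```javascript
var dns = require('dns');

// Hypothetical helper: turns the (err, records) pair passed to a dns
// callback into a single printable line.
function describeResult(label, err, records) {
  if (err) {
    return label + ' failed: ' + (err.code || err.message);
  }
  return label + ': ' + JSON.stringify(records);
}

function resolveAndReverse(domain) {
  dns.resolve4(domain, function (err, addresses) {
    console.log(describeResult('IPv4 addresses for ' + domain, err, addresses));
    if (err) { return; }  // nothing to reverse if the lookup failed
    addresses.forEach(function (addr) {
      dns.reverse(addr, function (err, domains) {
        console.log(describeResult('Reverse for ' + addr, err, domains));
      });
    });
  });
}

// Only touch the network when a domain is passed on the command line,
// for example: node dns_safe_lookup.js www.google.com
if (process.argv[2]) {
  resolveAndReverse(process.argv[2]);
}
```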
Using the crypto Module
The crypto module is interesting and fun to play around with. As the name
suggests, it provides cryptographic functionality; in other words, it helps you
secure data and communication through encryption, hashing, and signing. To use
crypto, you must make sure that it is loaded into your Node project. Node.js can
be built without including support for crypto, in which case calling
require('crypto') throws an error. The easiest way to ensure crypto is loaded is
to wrap the require in a simple try/catch block; for example:
let crypto;
try {
crypto = require('crypto');
} catch (err) {
console.log('crypto support is disabled!');
}
The crypto module includes several classes that provide functionality to encrypt
and decrypt data and streams. Table 10.3 lists all the different classes that the
crypto module provides.
Table 10.3 Classes that can be used in the crypto module
Class                  Description
Certificate            Used for working with SPKAC (a certificate signing
                       request mechanism) and primarily used to handle
                       output of the HTML5 <keygen> element.
Cipher                 Used to encrypt data in either a stream that is both
                       readable and writable, or using the cipher.update
                       and cipher.final methods.
Decipher               The opposite of Cipher. Used to decrypt data using
                       either a readable and writable stream or the
                       decipher.update and decipher.final methods.
DiffieHellman          Used to create key exchanges for Diffie-Hellman (a
                       specific method for exchanging cryptographic keys).
ECDH (Elliptic         Used to create key exchanges for ECDH (same as
Curve Diffie-          Diffie-Hellman, but the two parties use an elliptic
Hellman)               curve public-private key pair).
Hash                   Used to create hash digests of data using a readable
                       and writable stream or hash.update and hash.digest.
Hmac                   Used to create Hmac digests of data using a readable
                       and writable stream or hmac.update and hmac.digest.
Sign                   Used to generate signatures.
Verify                 Used in tandem with Sign to verify the signatures.
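To make the update/digest pattern from the table concrete, here is a small sketch that creates a hash digest and an Hmac digest of the same string. The sha256 algorithm, the input string, and the 'MySecret' key are choices made for this example only.

```javascript
var crypto = require('crypto');

// Hash: feed data with update(), then finish with digest().
var hash = crypto.createHash('sha256');
hash.update('BadWolf');
var digest = hash.digest('hex');

// Hmac follows the same pattern but mixes in a secret key.
var hmacDigest = crypto.createHmac('sha256', 'MySecret')
  .update('BadWolf')
  .digest('hex');

console.log('sha256 digest: ' + digest);
console.log('hmac digest:   ' + hmacDigest);
```

A sha256 digest in hex is always 64 characters long, regardless of the input size.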
The most common use for the crypto module is to use the Cipher and
Decipher classes to create encrypted data that can be stored and decrypted later;
for example, passwords. Initially, passwords are entered as text, but it would be
foolish to actually store them as text. Instead, passwords are encrypted using an
encryption algorithm such as aes192. This allows you to store data encrypted so
that if it is accessed without the decryption secret, your password is protected
from prying eyes. Listing 10.4 shows an example of encrypting and decrypting a
password string. The output follows in Listing 10.4 Output.
Listing 10.4 encrypt_password.js: Using cipher and decipher to
encrypt and then decrypt data
var crypto = require('crypto');
var crypMethod = 'aes192';
var secret = 'MySecret';
function encryptPassword(pwd){
var cipher = crypto.createCipher(crypMethod, secret);
var cryptedPwd = cipher.update(pwd,'utf8','hex');
cryptedPwd += cipher.final('hex');
return cryptedPwd;
}
function decryptPassword(pwd){
var decipher = crypto.createDecipher(crypMethod, secret);
var decryptedPwd = decipher.update(pwd,'hex','utf8');
decryptedPwd += decipher.final('utf8');
return decryptedPwd;
}
var encryptedPwd = encryptPassword("BadWolf");
console.log("Encrypted Password");
console.log(encryptedPwd);
console.log("\nDecrypted Password");
console.log(decryptPassword(encryptedPwd));
Listing 10.4 Output Using cipher and decipher to encrypt and then decrypt
data
Encrypted Password
0ebc7d846519b955332681c75c834d50
Decrypted Password
BadWolf
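Newer Node.js releases deprecate crypto.createCipher() in favor of crypto.createCipheriv(), which takes an explicit key and initialization vector. The following is a sketch of the same encrypt/decrypt round trip under that API. The function names, the fixed salt string, and the "iv:ciphertext" storage format are assumptions made for this example, not part of Listing 10.4.

```javascript
var crypto = require('crypto');

// Assumed values for illustration only; a real application would load the
// secret from configuration and use a random salt stored with the data.
var algorithm = 'aes-192-cbc';
var secret = 'MySecret';
var key = crypto.scryptSync(secret, 'example-salt', 24);  // 24 bytes = 192 bits

function encryptText(text) {
  var iv = crypto.randomBytes(16);  // AES block size
  var cipher = crypto.createCipheriv(algorithm, key, iv);
  var encrypted = cipher.update(text, 'utf8', 'hex');
  encrypted += cipher.final('hex');
  // Prepend the IV so decryptText() can recover it.
  return iv.toString('hex') + ':' + encrypted;
}

function decryptText(payload) {
  var parts = payload.split(':');
  var iv = Buffer.from(parts[0], 'hex');
  var decipher = crypto.createDecipheriv(algorithm, key, iv);
  var decrypted = decipher.update(parts[1], 'hex', 'utf8');
  decrypted += decipher.final('utf8');
  return decrypted;
}

var encrypted = encryptText('BadWolf');
console.log(encrypted);
console.log(decryptText(encrypted));  // BadWolf
```

Because the IV is random, the encrypted string differs on every run even though the decrypted value is the same.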
Other Node Modules and Objects
This section lists some other Node modules and objects that would be beneficial for
you to know about:
Global: Objects available throughout all the modules. Globals range anywhere
from __dirname, which gives you the name of the directory containing the current
module, to the process object.
V8: Module used to expose APIs, specifically for the version of V8 that is built
in to the Node binary.
Debugger: Module used to debug your Node application. To use, simply start
Node with the debug argument, like so: $ node debug myscript.js.
Assertion testing: Module that provides a basic set of assertion tests used to
test invariants.
C/C++ add-ons: Objects that allow you to dynamically link shared objects
written in C or C++. They provide an interface that both JavaScript in Node and
C/C++ libraries can use, allowing them to work as regular Node.js modules.
REPL (Read-Eval-Print Loop): Accepts individual lines of input, evaluates
them using a user-defined function, and then outputs the results.
Summary
The os module allows you to get information about the system, including the
operating system type and version, the platform architecture, and programming
aids such as the amount of free memory, the temp folder location, and EOL characters.
The util module is the catch-all library for Node that has methods for synchronous
output, string formatting, and type checking. The dns module performs DNS
lookups and reverse lookups from a Node.js application. The crypto module
encrypts and decrypts data to secure private data.
Next
In the next chapter, you jump into the world of MongoDB. You learn the MongoDB
basics and how to implement it in the Node.js world.
Part III: Learning MongoDB
11
Understanding NoSQL and MongoDB
At the core of most large-scale web applications and services is a high-performance
data storage solution. The backend data store is responsible for storing everything
from user account information to shopping cart items to blog and comment data.
Good web applications must store and retrieve data with accuracy, speed, and
reliability. Therefore, the data storage mechanism you choose must perform at a
level that satisfies user demand.
Several different data storage solutions are available to store and retrieve data needed
by your web applications. The three most common are direct file system storage in
files, relational databases, and NoSQL databases. The data store chosen for this book
is MongoDB, which is a NoSQL database.
The following sections describe MongoDB and discuss the design considerations
you need to review before deciding how to implement the structure of data and
configuration of the database. The sections cover the questions to ask yourself, and
then cover the mechanisms built into MongoDB to satisfy the demands of the
answers to those questions.
Why NoSQL?
The concept of NoSQL (Not Only SQL) consists of technologies that provide
storage and retrieval without the tightly constrained models of traditional SQL
relational databases. The motivation behind NoSQL is mainly simplified designs,
horizontal scaling, and finer control of the availability of data.
NoSQL breaks away from the traditional structure of relational databases and allows
developers to implement models in ways that more closely fit the data flow needs of
their systems. This allows NoSQL databases to be implemented in ways that
traditional relational databases could never be structured.
There are several different NoSQL technologies, such as HBase’s column structure,
Redis’s key/value structure, and Neo4j’s graph structure. However, in this book
MongoDB and the document model were chosen because of great flexibility and
scalability when it comes to implementing backend storage for web applications and
services. Also MongoDB is one of the most popular and well supported NoSQL
databases currently available.
Understanding MongoDB
MongoDB is a NoSQL database based on a document model where data objects are
stored as separate documents inside a collection. The motivation behind
MongoDB is to implement a data store that provides high performance, high
availability, and automatic scaling. MongoDB is simple to install and implement, as
you see in the upcoming chapters.
Understanding Collections
MongoDB groups data together through collections. A collection is simply a
grouping of documents that have the same or a similar purpose. A collection acts
similarly to a table in a traditional SQL database, with one major difference. In
MongoDB, a collection does not enforce a strict schema; instead, documents in a
collection can have a slightly different structure from one another as needed. This
reduces the need to break items in a document into several different tables, which is
often done in SQL implementations.
Understanding Documents
A document is a representation of a single entity of data in the MongoDB database.
A collection is made up of one or more related objects. A major difference between
MongoDB and SQL is that documents are different from rows. Row data is flat,
meaning there is one column for each value in the row. However, in MongoDB,
documents can contain embedded subdocuments, thus providing a much closer
inherent data model to your applications.
In fact, the records in MongoDB that represent documents are stored as BSON,
which is a lightweight binary form of JSON, with field:value pairs
corresponding to JavaScript property:value pairs. These field:value pairs
define the values stored in the document. That means little translation is necessary to
convert MongoDB records back into the JavaScript object that you use in your
Node.js applications.
For example, a document in MongoDB may be structured similarly to the following
with name, version, languages, admin, and paths fields:
{
name: "New Project",
version: 1,
languages: ["JavaScript", "HTML", "CSS"],
admin: {name: "Brad", password: "****"},
paths: {temp: "/tmp", project: "/opt/project", html: "/opt/project/html"}
}
Notice that the document structure contains fields/properties that are strings,
integers, arrays, and objects, just like a JavaScript object. Table 11.1 lists the
different data types that field values can be set to in the BSON document.
The field names cannot contain null characters, . (dots), or $ (dollar signs). Also,
the _id field name is reserved for the Object ID. The _id field is a unique ID for
the system that is made up of the following parts:
A 4-byte value representing the seconds since the Unix epoch
A 3-byte machine identifier
A 2-byte process ID
A 3-byte counter, starting with a random value
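The four parts listed above can be pulled out of an Object ID's 24-character hex string with plain JavaScript. This is a sketch that assumes the byte layout just described; the parseObjectId() helper and the sample ID are illustrative and not part of any MongoDB API.

```javascript
// Splits a 24-character hex Object ID string into the four parts listed above.
function parseObjectId(hex) {
  return {
    seconds: parseInt(hex.substring(0, 8), 16),  // 4-byte timestamp
    machine: hex.substring(8, 14),               // 3-byte machine identifier
    processId: hex.substring(14, 18),            // 2-byte process ID
    counter: hex.substring(18, 24)               // 3-byte counter
  };
}

// Sample ID used for illustration only.
var parts = parseObjectId('507f1f77bcf86cd799439011');
console.log(parts);
console.log(new Date(parts.seconds * 1000).toISOString());
```

Because the timestamp comes first, sorting documents by _id roughly orders them by creation time.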
The maximum size of a document in MongoDB is 16MB, which prevents queries
that result in an excessive amount of RAM being used or intensive hits to the file
system. Although you may never come close, you still need to keep the maximum
document size in mind when designing some complex data types that contain file
data.
MongoDB Data Types
The BSON data format provides several different types that are used when storing
the JavaScript objects to binary form. These types match the JavaScript type as
closely as possible. It is important to understand these types because you can actually
query MongoDB to find objects that have a specific property that has a value of a
certain type. For example, you can look for documents in a database whose
timestamp value is a String object or query for ones whose timestamp is a Date
object.
MongoDB assigns each of the data types an integer ID number from 1 to 255 that is
used when querying by type. Table 11.1 shows a list of the data types that MongoDB
supports along with the number MongoDB uses to identify them.
Table 11.1 MongoDB data types and corresponding ID number
Type                       Number
Double                     1
String                     2
Object                     3
Array                      4
Binary data                5
Object id                  7
Boolean                    8
Date                       9
Null                       10
Regular Expression         11
JavaScript                 13
JavaScript (with scope)    15
32-bit integer             16
Timestamp                  17
64-bit integer             18
Decimal128                 19
Min key                    -1
Max key                    127
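The type numbers in Table 11.1 are what you pass to MongoDB's $type query operator. The sketch below only builds the query objects as plain JavaScript; the timestamp field comes from the example in the text, and the events collection named in the comment is hypothetical.

```javascript
// Type numbers taken from Table 11.1.
var STRING_TYPE = 2;
var DATE_TYPE = 9;

// Matches documents whose timestamp field is stored as a String.
var stringTimestamps = { timestamp: { $type: STRING_TYPE } };

// Matches documents whose timestamp field is stored as a Date.
var dateTimestamps = { timestamp: { $type: DATE_TYPE } };

// With a driver collection object (hypothetical name: events), either
// query object would be passed to find(), for example:
//   events.find(stringTimestamps)
console.log(JSON.stringify(stringTimestamps));
console.log(JSON.stringify(dateTimestamps));
```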
Another thing to be aware of when working with different data types in MongoDB is
the order in which they are compared. When comparing values of different BSON
types, MongoDB uses the following comparison order from lowest to highest:
1. Min Key (internal type)
2. Null
3. Numbers (32-bit integer, 64-bit integer, Double)
4. String
5. Object
6. Array
7. Binary Data
8. Object ID
9. Boolean
10. Date, Timestamp
11. Regular Expression
12. Max Key (internal type)
Planning Your Data Model
Before you begin implementing a MongoDB database, you need to understand the
nature of the data being stored, how that data is going to get stored, and how it is
going to be accessed. Understanding these concepts allows you to make
determinations ahead of time and to structure the data and your application for
optimal performance.
Specifically, you should ask yourself the following questions:
What are the basic objects that my application will be using?
What is the relationship between the different object types: one-to-one, one-to-many, or many-to-many?
How often will new objects be added to the database?
How often will objects be deleted from the database?
How often will objects be changed?
How often will objects be accessed?
How will objects be accessed: by ID, property values, comparisons, and so on?
How will groups of object types be accessed: by common ID, common property
value, and so on?
Once you have the answers to these questions, you are ready to consider the structure
of collections and documents inside MongoDB. The following sections discuss
different methods of document, collection, and database modeling you can use in
MongoDB to optimize data storage and access.
Normalizing Data with Document References
Data normalization is the process of organizing documents and collections to
minimize redundancy and dependency. This is done by identifying object properties
that are subobjects and should be stored as a separate document in another collection
from the object’s document. Typically, this is used for objects that have a one-to-many or many-to-many relationship with subobjects.
The advantage of normalizing data is that the database size will be smaller because
only a single copy of an object will exist in its own collection instead of being
duplicated on multiple objects in a single collection. Also, if you modify the
information in the subobject frequently, you only need to modify a single instance
rather than every record in the object’s collection that has that subobject.
A major disadvantage of normalizing data is that when looking up user objects that
require the normalized subobject, a separate lookup must occur to link the subobject.
This can result in a significant performance hit if you are accessing the user data
frequently.
An example of when it makes sense to normalize data is a system that contains users
that have a favorite store. Each User is an object with name, phone, and
favoriteStore properties. The favoriteStore property is also a subobject
that contains name, street, city, and zip properties.
However, thousands of users may have the same favorite store, so there is a high
one-to-many relationship. Therefore, it doesn’t make sense to store the
FavoriteStore object data in each User object because it would result in
thousands of duplications. Instead, the FavoriteStore object should include an
_id object property that can be referenced from the favoriteStore property of
each User document. The application can then use the reference ID in
favoriteStore to link data from the Users collection to FavoriteStore
documents in the FavoriteStores collection.
Figure 11.1 illustrates the structure of the Users and FavoriteStores
collections just described.
Figure 11.1 Defining normalized MongoDB documents by adding a reference to
documents in another collection
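The normalized layout can be sketched with plain JavaScript objects standing in for the two collections. All names and values below are made up; the findFavoriteStore() helper represents the separate lookup that normalization requires.

```javascript
// Stand-in for the FavoriteStores collection: one copy of each store.
var favoriteStores = [
  { _id: 1001, name: 'Corner Books', street: '1 Main St',
    city: 'Springfield', zip: '00001' }
];

// Stand-in for the Users collection: each user stores only the store's _id.
var users = [
  { name: 'Brad', phone: '555-0100', favoriteStore: 1001 },
  { name: 'Caleb', phone: '555-0101', favoriteStore: 1001 }
];

// The extra lookup needed to link a user to the referenced store.
function findFavoriteStore(user) {
  for (var i = 0; i < favoriteStores.length; i++) {
    if (favoriteStores[i]._id === user.favoriteStore) {
      return favoriteStores[i];
    }
  }
  return null;
}

console.log(users[1].name + ' likes ' + findFavoriteStore(users[1]).name);
```

Updating the store's address touches one document, but every read of a user's store costs a second lookup.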
Denormalizing Data with Embedded Documents
Denormalizing data is the process of identifying subobjects of a main object that
should be embedded directly into the document of the main object. Typically this is
done on objects that have a mostly one-to-one relationship or are relatively small and
do not get updated frequently.
The major advantage of denormalized documents is that you can get the full object
back in a single lookup without the need to do additional lookups to combine
subobjects from other collections. This is a major performance enhancement. The
downside is that for subobjects with a one-to-many relationship you store a separate
copy in each document, which slows down insertion and also takes up additional
disk space.
An example of when it makes sense to denormalize data is a system that contains users
with home and work contact information. The user is an object represented by a
User document with name, home, and work properties. The home and work
properties are subobjects that contain phone, street, city, and zip properties.
The home and work properties do not change often on the user. You may have
multiple users from the same home; however, there likely will not be many of them,
and the actual values inside the subobjects are not that big and will not change often.
Therefore, it makes sense to store the home contact information directly in the User
object.
The work property takes a bit more thinking. How many people are going to have
the same work contact information? If the answer is not many, then the work object
should be embedded with the User object. How often are you querying the User
and need the work contact information? If the answer is rarely, then you may want
to normalize work into its own collection. However, if the answer is frequently or
always, then you will likely want to embed work with the User object.
Figure 11.2 illustrates the structure of the Users with home and work contact
information embedded as just described.
Figure 11.2 Defining denormalized MongoDB documents by implementing
embedded objects inside a document
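A denormalized User document of the kind just described might look like the following plain JavaScript object. The field values are invented for illustration; the point is that the home and work subobjects travel with the user in a single document.

```javascript
// One document holds the user plus both embedded contact subobjects.
var user = {
  name: 'Brad',
  home: {
    phone: '555-0100',
    street: '1 Main St',
    city: 'Springfield',
    zip: '00001'
  },
  work: {
    phone: '555-0199',
    street: '200 Office Park',
    city: 'Springfield',
    zip: '00002'
  }
};

// No second lookup is needed to reach the embedded data.
console.log(user.name + ' works in ' + user.work.city);
```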
Using Capped Collections
A great feature of MongoDB is the ability to create a capped collection, which is a
collection that has a fixed size. When writing a new document would cause the
collection to exceed that size, the oldest document in the collection is deleted and
the new document is inserted. Capped collections work great for objects that have a
high rate of insertion, retrieval, and deletion.
The following list contains the benefits of using capped collections:
Capped collections guarantee that the insertion order is preserved. Queries do
not need to use an index to return documents in the order they were stored, thus
eliminating indexing overhead.
Capped collections also guarantee that the insertion order is identical to the
order on disk by prohibiting updates that increase the document size. This
eliminates the overhead of relocating and managing the new location of
documents.
Capped collections automatically remove the oldest documents in the collection.
Therefore, you do not need to implement deletion in your application code.
Be careful using capped collections, though, as they have the following restrictions:
Documents cannot be updated to a larger size once they have been inserted into
the capped collection. You can update them, but the data must be the same size or
smaller.
Documents cannot be deleted from a capped collection. That means that the
data takes up space on disk even if it is not being used. You can explicitly drop
the capped collection to effectively delete all entries, but you need to re-create it
to use it again.
A great use of capped collections is as a rolling log of transactions in your system.
You can always access the last X number of log entries without needing to explicitly
clean up the oldest.
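The rolling behavior can be pictured with a small in-memory sketch. This is not
MongoDB code: the CappedLog name is hypothetical, and it caps by document count
for simplicity, whereas a real capped collection is sized in bytes. In the MongoDB
shell, a capped collection is created by passing the capped and size options to
db.createCollection().

```javascript
// In-memory stand-in for a capped collection (illustration only).
function CappedLog(maxDocs) {
  this.maxDocs = maxDocs;
  this.docs = [];
}

// Insert a document; once the cap is exceeded the oldest one is dropped.
CappedLog.prototype.insert = function (doc) {
  this.docs.push(doc);
  if (this.docs.length > this.maxDocs) {
    this.docs.shift();
  }
};

// Documents come back in insertion order without any index.
CappedLog.prototype.find = function () {
  return this.docs.slice();
};

var log = new CappedLog(3);
["login", "purchase", "logout", "login"].forEach(function (msg, i) {
  log.insert({ seq: i + 1, msg: msg });
});
console.log(log.find()); // holds seq 2, 3, and 4; seq 1 was removed
```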
Understanding Atomic Write Operations
Write operations are atomic at the document level in MongoDB, which means that
only one process can update a single document or a single collection at the same
time. This means that writing to documents that are denormalized is atomic.
However, writing to documents that are normalized requires separate write
operations to subobjects in other collections, and therefore the writes of the
normalized object may not be atomic as a whole.
Keep atomic writes in mind when designing your documents and collections to
ensure that the design fits the needs of the application. In other words, if you
absolutely must write all parts of an object as a whole in an atomic manner, then you
need to design the object in a denormalized fashion.
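To make the difference concrete, the following sketch uses plain JavaScript Maps
as stand-ins for collections. The collection names, the saveDenormalized() and
saveNormalized() helpers, and the simulated failure are all assumptions for
illustration; real MongoDB writes would go through the shell or a driver.

```javascript
// Plain Maps stand in for MongoDB collections (illustration only).
var orders = new Map();
var orderItems = new Map();

// Denormalized: the items are embedded, so a single document write
// stores the whole object or nothing.
function saveDenormalized(order) {
  orders.set(order._id, order);
}

// Normalized: two separate writes. If something fails between them,
// the order exists without its items.
function saveNormalized(order, items, failBetweenWrites) {
  orders.set(order._id, { _id: order._id, customer: order.customer });
  if (failBetweenWrites) {
    throw new Error("simulated failure between writes");
  }
  orderItems.set(order._id, items);
}

saveDenormalized({ _id: 1, customer: "Jane", items: ["book", "pen"] });

try {
  saveNormalized({ _id: 2, customer: "Raj" }, ["lamp"], true);
} catch (err) {
  console.log(err.message);
}

console.log(orders.has(2));     // true: the first write succeeded
console.log(orderItems.has(2)); // false: the second write never happened
```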
Considering Document Growth
When you update a document, consider what effect the new data will have on
document growth. MongoDB provides some padding in documents to allow for
typical growth during an update operation. However, if the update causes the
document to grow to an amount that exceeds the allocated space on disk, MongoDB
has to relocate that document to a new location on the disk, incurring a performance
hit on the system. Frequent document relocation can also lead to disk fragmentation
issues, for example, when a document contains an array and you keep adding
elements to the array.
One way to mitigate document growth is to use normalized objects for the properties
that may grow frequently. For example, instead of using an array to store items in a
Cart object, you could create a collection for CartItems and store new items that
get placed in the cart as new documents in the CartItems collection and then
reference the user’s Cart within each of them.
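A minimal sketch of that layout follows, using plain JavaScript arrays in place of
collections. The field names (cartId, product, qty) and the addItem() and
itemsForCart() helpers are hypothetical.

```javascript
// Arrays stand in for the Cart and CartItems collections (illustration only).
var carts = [{ _id: "cart1", user: "jane" }];
var cartItems = [];

// Adding an item inserts a new small document that points back at its
// cart, so the Cart document itself never grows.
function addItem(cartId, product, qty) {
  cartItems.push({ cartId: cartId, product: product, qty: qty });
}

// Collect the items for one cart by following the reference.
function itemsForCart(cartId) {
  return cartItems.filter(function (item) {
    return item.cartId === cartId;
  });
}

addItem("cart1", "keyboard", 1);
addItem("cart1", "mouse", 2);
console.log(itemsForCart("cart1").length);
console.log(Object.keys(carts[0]));
```

The trade-off is the one described earlier for normalized data: reading a full cart
now takes a second lookup.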
Identifying Indexing, Sharding, and Replication Opportunities
MongoDB provides several mechanisms to optimize performance, scaling, and
reliability. As you contemplate your database design, consider each of the following
options:
Indexing: Indexes improve performance for frequent queries by building a
lookup index that can be easily sorted. The _id property of a collection is
automatically indexed because it is a common practice to look items up by ID.
However, you also need to consider what other ways users access data and
implement indexes that will enhance those lookup methods.
Sharding: Sharding is the process of slicing up large collections of data that
can be split between multiple MongoDB servers in a cluster. Each MongoDB
server is considered a shard. This provides the benefit of using multiple servers
to support a high number of requests to a large system, thus providing
horizontal scaling to your database. Look at the size of your data and the
number of requests that will be accessing it to determine whether and how much
to shard your collections.
Replication: Replication is the process of duplicating data on multiple
MongoDB instances in a cluster. When considering the reliability aspect of your
database, you should implement replication to ensure that a backup copy of
critical data is always readily available.
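To picture why an index helps, the sketch below compares scanning every document
with looking a value up in a prebuilt structure. It is a conceptual illustration in
plain JavaScript, not how MongoDB implements its indexes, and the users data and
function names are made up.

```javascript
// Sample documents (hypothetical data).
var users = [
  { _id: 1, email: "ann@example.com" },
  { _id: 2, email: "bob@example.com" },
  { _id: 3, email: "cy@example.com" }
];

// Without an index: examine documents one by one until a match is found.
function findByScan(email) {
  var examined = 0;
  for (var i = 0; i < users.length; i++) {
    examined++;
    if (users[i].email === email) {
      return { doc: users[i], examined: examined };
    }
  }
  return { doc: null, examined: examined };
}

// With an index: build the lookup structure once, then go straight
// to the matching document.
var emailIndex = new Map();
users.forEach(function (u) {
  emailIndex.set(u.email, u);
});

function findByIndex(email) {
  return emailIndex.get(email) || null;
}

console.log(findByScan("cy@example.com").examined);
console.log(findByIndex("cy@example.com")._id);
```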
Large Collections Versus Large Numbers of Collections
Another important thing to consider when designing your MongoDB documents and
collections is the number of collections that the design will result in. There isn’t a
significant performance hit for having a large number of collections; however, there
is a performance hit for having large numbers of items in the same collection.
Consider ways to break up your larger collections into more consumable chunks.
For example, say that you store a history of user transactions in the database for past
purchases. You recognize that for these completed purchases, you will never need to
look them up together for multiple users. You only need them available for the user
to look at his or her own history. If you have thousands of users who have a lot of
transactions, then it makes sense to store those histories in a separate collection for
each user.
Deciding on Data Life Cycles
One of the most commonly overlooked aspects of database design is that of the data
life cycle. Specifically, how long should documents exist in a specific collection?
Some collections have documents that should be kept indefinitely, for example, active
user accounts. However, keep in mind that each document in the system incurs a
performance hit when querying a collection. You should define a TTL, or time-to-live,
value for documents in each of your collections.
There are several ways to implement a time-to-live mechanism in MongoDB. One
way is to implement code in your application to monitor and clean up old data.
Another way is to use the MongoDB TTL setting on a collection, which allows you
to define a profile where documents are automatically deleted after a certain number
of seconds or at a specific clock time. For collections where you only need the most
recent documents, you can implement a capped collection that automatically keeps
the size of the collection small.
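The first approach, cleaning up old data in application code, might be sketched as
follows. The purgeExpired() function, the createdAt field, and the sample data are
assumptions for illustration; the array stands in for a collection.

```javascript
// Return only the documents that are younger than ttlSeconds at time `now`.
// Both `now` and `createdAt` are millisecond timestamps.
function purgeExpired(docs, ttlSeconds, now) {
  return docs.filter(function (doc) {
    return now - doc.createdAt < ttlSeconds * 1000;
  });
}

var now = Date.UTC(2020, 0, 1, 12, 0, 0);
var sessions = [
  { user: "ann", createdAt: now - 30 * 1000 },   // 30 seconds old
  { user: "bob", createdAt: now - 7200 * 1000 }  // 2 hours old
];

// Keep sessions for one hour.
sessions = purgeExpired(sessions, 3600, now);
console.log(sessions.map(function (s) { return s.user; }));
```

Passing the current time in as a parameter, rather than reading the clock inside
the function, keeps the cleanup logic easy to test.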
Considering Data Usability and Performance
Two more important things to consider when designing a MongoDB database are
data use and how it will affect performance. The previous sections described
different methods for solving some complexities of data size and optimization. The
final things you should consider and even reconsider are data usability and
performance. Ultimately, these are the two most important aspects of any web
solution and, consequently, the storage behind it.
Data usability describes the ability for the database to satisfy the functionality of the
website. First, you must make sure that the data can be accessed so that the website
functions correctly. Users will not tolerate a website that simply does not do what
they want it to. This also includes the accuracy of the data.
Then you can consider performance. Your database must be able to deliver the data
at a reasonable rate. You can consult the previous sections when evaluating and
designing the performance factors for your database.
In some more complex circumstances, you may find it necessary to evaluate data
usability and then performance and then go back and evaluate usability again for a
few cycles until you get the balance correct. Also, keep in mind that in today’s
world, usability requirements can change at any time. Remembering that can
influence how you design your documents and collections so that they can become
more scalable in the future if necessary.
Summary
In this chapter you learned about MongoDB and design considerations for the
structure of data and configuration of a database. You learned about collections,
documents, and the types of data that can be stored in them. You also learned how to
plan your data model, what questions you need to answer, and the mechanisms built
into MongoDB to satisfy the demands on your database.
Next
In the next chapter, you install MongoDB. You also learn how to use the MongoDB
shell to set up user accounts and access collections and documents.
12
Getting Started with MongoDB
This chapter gets you up to speed with MongoDB. Whereas Chapter 11,
“Understanding NoSQL and MongoDB,” focused more on the theory side of
MongoDB, this chapter is all about practical application. You learn what it takes to
install MongoDB, start and stop the engine, and access the MongoDB shell. The
MongoDB shell allows you to administer the MongoDB server as well as perform
every necessary function on the databases. Using the MongoDB shell is a vital aspect
of the development process as well as database administration.
This chapter covers installing MongoDB and accessing the shell. The chapter
focuses on some basic administrative tasks such as setting up user accounts and
authentication. The chapter then wraps up by describing how to administer
databases, collections, and documents.
Building the MongoDB Environment
To get started with MongoDB, the first task is to install it on your development
system. Once installed on your development system, you can play around with the
functionality, learn the MongoDB shell, and prepare for Chapter 13, “Getting Started
with MongoDB and Node.js,” in which you begin integrating MongoDB into your
Node.js applications.
The following sections cover installation, starting and stopping the database engine,
and accessing the shell client. Once you can do those things you are ready to begin
using MongoDB in your environment.
Installing MongoDB
The first step in getting MongoDB implemented into your Node.js environment is
installing the MongoDB server. There is a version of MongoDB for each of the
major platforms, including Linux, Windows, Solaris, and OS X. There is also an
enterprise version available for the Red Hat, SuSE, Ubuntu, and Amazon Linux
distributions. The enterprise version of MongoDB is subscription-based and provides
enhanced security, management, and integration support.
For the purposes of this book and learning MongoDB, the standard edition of
MongoDB is perfect. Before continuing, go to the MongoDB website at
http://docs.mongodb.org/manual/installation/. Follow the links and instructions to
download and install MongoDB in your environment.
As part of the installation and setup process, perform the following steps:
1. Download and extract the MongoDB files.
2. Add the bin directory of your MongoDB installation to your system path.
3. Create a data files directory: /data/db.
4. Start MongoDB using the following command from the console prompt:
mongod --dbpath /data/db
Starting MongoDB
Once you have installed MongoDB, you need to be able to start and stop the
database engine. The database engine starts by executing the mongod
(mongod.exe on Windows) executable in the bin directory of your MongoDB
installation. This executable starts MongoDB and begins listening for database
requests on the configured port.
The mongod executable accepts several different parameters that provide methods
of controlling its behavior. For example, you can configure the IP address and
port MongoDB listens on as well as logging and authentication. Table 12.1
provides a list of some of the most commonly used parameters.
Here is an example of starting MongoDB with the port and dbpath parameters:
mongod --port 28008 --dbpath /data/db
Table 12.1 mongod command-line parameters
Parameter            Description
--help, -h           Returns basic help and usage text.
--version            Returns the version of MongoDB.
--config, -f         Specifies a configuration file that contains runtime configurations.
--verbose, -v        Increases the amount of internal reporting sent to the console and written to the log file specified by --logpath.
--quiet              Reduces the amount of reporting sent to the console and log file.
--port               Specifies a TCP port for mongod to listen for client connections. Default: 27017.
--bind_ip            Specifies the IP address that mongod binds to and listens on for connections. Default: All Interfaces
--maxConns           Specifies the maximum number of simultaneous connections that mongod will accept. Max: 20000
--logpath            Specifies a path for the log file. On restart, the log file is overwritten unless you also specify --logappend.
--auth               Enables database authentication for users connecting from remote hosts.
--dbpath             Specifies a directory for the mongod instance to store its data.
--nohttpinterface    Disables the HTTP interface.
--nojournal          Disables the durability journaling.
--noprealloc         Disables the preallocation of data files, which shortens the startup time but can cause significant performance penalties during normal operations.
--repair             Runs a repair routine on all databases.
Stopping MongoDB
Each platform has different methods of stopping the mongod executable once it has
started. However, one of the best methods is to stop it from the shell client because
that cleanly shuts down the current operations and forces mongod to exit.
To stop the MongoDB database from the shell client, use the following commands to
switch to the admin database and then shut down the database engine:
use admin
db.shutdownServer()
Accessing MongoDB from the Shell Client
Once you have installed, configured, and started MongoDB, you can access it
through the MongoDB shell. The MongoDB shell is an interactive shell provided
with MongoDB that allows you to access, configure, and administer MongoDB
databases, users, and much more. You use the shell for everything from setting up
user accounts to creating databases to querying the contents of the database.
The following sections take you through some of the most common administration
tasks that you perform in the MongoDB shell. Specifically, you need to be able to
create user accounts, databases, and collections to follow the examples in the rest of
the book. Also you should be able to perform at least rudimentary queries on
documents to help you troubleshoot any problems with accessing data.
To start the MongoDB shell, first make sure that mongod is running, and then
execute the mongo command from the console prompt.
The shell should start up as shown in Figure 12.1.
Figure 12.1 Starting the MongoDB console shell
Once you have accessed the MongoDB shell, you can administer all aspects of
MongoDB. There are a couple of things to keep in mind when using the MongoDB shell.
First, it is based on JavaScript, and most JavaScript syntax is available. Second, the
shell provides direct access to the database and collections on the server so changes
you make directly impact the data on the server.
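Because the shell is JavaScript-based, ordinary statements such as variable
declarations, functions, and array methods can be typed at its prompt. The snippet
below is plain JavaScript with made-up names; it does not touch any database.

```javascript
// Plain JavaScript that does not depend on any MongoDB objects.
var ports = [27017, 28008];

// Label a port as the MongoDB default or a custom value.
function describePort(port) {
  return port === 27017 ? "default" : "custom";
}

var labels = ports.map(describePort);
```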
Understanding MongoDB Shell Commands
The MongoDB shell provides several commands that can be executed from the shell
prompt. You need to be familiar with these commands as you will use them a lot.
The following list describes each command and its purpose:
help