JavaScript Regular Expressions

English | 2015 | ISBN: 978-1-78328-225-8 | 112 Pages | PDF, EPUB, MOBI +code | 10 MB


Regular expressions are patterns or templates that allow you to define a set of rules in a natural yet vague way, giving you the ability to match and validate text. Therefore, they have been implemented in nearly every modern programming language. JavaScript’s implementation allows us to perform complex tasks with a few lines of code using regular expressions to match and extract data out of text.
This book starts by exploring what a pattern actually is and how regular expressions express these patterns to match and manipulate user data. You then move on to learning about the use of character classes to define a wildcard match, a digit match, and an alphanumeric match. You will then learn to manipulate text and shorten data in URLs, paths, markup, and data exchange, as well as other advanced Regex features.
Finally, you will work through real-world examples, both in the browser and on the server side using Node.js.

Special Characters

Capture and noncapture groups

In the first chapter, we saw an example where we wanted to parse some kind of XML tokens, and we said that we needed an extra constraint where the closing tag had to match the opening tag for it to be valid. So, for example, this should be parsed:

An input whose closing tag doesn’t match its opening tag should be rejected. The way to reference previous groups in your pattern is by using a backslash character, followed by the group’s index number. As an example, let’s write a small script that will accept a line-delimited series of XML tags, and then convert it into a JavaScript object. To start with, let’s create an input string:
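The following is a minimal sketch of such a script, not the book's original listing. The input string, the variable names (input, tagPattern, parseTags), and the restriction to lowercase tag names are all assumptions made for illustration. It shows how \1 forces the closing tag to repeat whatever group 1 captured:

```javascript
// Hypothetical input: one XML-like tag per line (not the book's sample data).
var input = [
  '<title>Regex Basics</title>',
  '<author>Jane Doe</author>',
  '<year>2015</dog>' // closing tag does not match, so this line is skipped
].join('\n');

// ([a-z]+) captures the opening tag name as group 1.
// \1 is a backreference: the closing tag must contain the same text.
var tagPattern = /^<([a-z]+)>(.*?)<\/\1>$/;

// Convert every valid line into a property of a plain object.
function parseTags(text) {
  var result = {};
  text.split('\n').forEach(function (line) {
    var match = tagPattern.exec(line.trim());
    if (match) {
      result[match[1]] = match[2]; // tag name -> inner text
    }
  });
  return result;
}

var parsed = parseTags(input);
console.log(parsed);
```

Only the first two lines end up in the resulting object, because the third line fails the backreference check.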

Node.js and Regex

Creating the Apache log Regex

In the Apache access log file, we have nine parts that we want to recognize and extract from each line of the file. We can try two approaches while creating a Regex: we can be very specific or more generic. As mentioned previously, the most powerful regular expressions are the ones that are generic. We will try to achieve these expressions in this chapter as well.
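As a rough sketch of the generic approach, the pattern below assumes that the nine parts are the fields of Apache's combined log format (host, identity, user, timestamp, request, status, size, referer, and user agent). The sample line and the variable names are illustrative and may differ from the book's own expression:

```javascript
// Sample line in Apache's combined log format (illustrative data).
var sampleLine = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] ' +
  '"GET /apache_pb.gif HTTP/1.0" 200 2326 ' +
  '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"';

// Generic pattern: each part is described by its delimiters only,
// not by its exact contents.
//   (\S+)        any run of non-space characters
//   \[([^\]]+)\] anything between square brackets
//   "([^"]*)"    anything between double quotes
var logPattern =
  /^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\S+) (\S+) "([^"]*)" "([^"]*)"$/;

var parts = logPattern.exec(sampleLine);

if (parts) {
  // parts[0] is the whole match; parts[1] to parts[9] are the nine groups.
  console.log('host:   ' + parts[1]);
  console.log('time:   ' + parts[4]);
  console.log('status: ' + parts[6]);
}
```

A more specific version would replace groups such as (\S+) with stricter sub-patterns, for example \d{3} for the status code, at the cost of rejecting lines that deviate from the expected shape.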

Creating a JSON object for each row

Let’s try to do something more useful with each row of the Apache log. We are going to create a JavaScript Object Notation (JSON) object with each row and add it to an array. To wrap up our application, we will save the JSON content into a file.

So after the Regex declaration (which is inside the var declaration), we are going to add a new variable that will hold the collection of JSON objects we are going to create:
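One possible shape for this step is sketched below. The names jsonData, fieldNames, and addRow, the property names, the sample lines, and the output location in the system temporary directory are all assumptions rather than the book's code. The pattern is a generic combined-log expression assumed for illustration:

```javascript
var fs = require('fs');
var os = require('os');
var path = require('path');

// Generic pattern with nine groups (assumed, not the book's exact Regex).
var logPattern =
  /^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\S+) (\S+) "([^"]*)" "([^"]*)"$/;

// Hypothetical property names, one per capture group.
var fieldNames = ['host', 'identity', 'user', 'time', 'request',
                  'status', 'size', 'referer', 'userAgent'];

// The collection that will hold one object per log row.
var jsonData = [];

// Turn one log line into an object and add it to the collection.
// Returns false when the line does not match the pattern.
function addRow(line) {
  var match = logPattern.exec(line);
  if (!match) {
    return false;
  }
  var row = {};
  fieldNames.forEach(function (name, i) {
    row[name] = match[i + 1]; // group numbering starts at 1
  });
  jsonData.push(row);
  return true;
}

// Illustrative rows; a real application would read them from the log file.
var sampleLines = [
  '192.168.0.10 - - [12/Mar/2015:08:15:02 +0000] "GET /index.html HTTP/1.1" 200 1043 "-" "curl/7.40.0"',
  'this line is not a valid log entry'
];
sampleLines.forEach(addRow);

// Save the collection as JSON (the output path is an assumption).
var outputPath = path.join(os.tmpdir(), 'access-log.json');
fs.writeFileSync(outputPath, JSON.stringify(jsonData, null, 2));
console.log('Saved ' + jsonData.length + ' row(s) to ' + outputPath);
```

Lines that do not match are skipped rather than stored as empty objects, so the saved file contains only rows that were parsed successfully.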

In this chapter, we learned how to create a simple Node.js application that reads an Apache log file and extracts the log information using a regular expression. We were able to put into practice the knowledge we acquired in the previous chapters of the book.

We also learned that to create a very complex Regex, it is best to do it in parts. We learned that we can be very specific while creating a regular expression or we can be more generic and achieve the same results.

As a new version of ECMAScript is being created (ECMAScript 6, which will add lots of new features to JavaScript), it is good to familiarize yourself with the improvements related to regular expressions as well. For more information, please visit www.ecmascript.org/dev.php.

We hope you enjoy the book! Have fun creating regular expressions!