Regex for matching text that does not contain a given word

Often we would like to see if a input string is present in the given text or not. While, regular expressions are not the best way to do this, it is useful to understand how we can achieve this using regular expressions. I will present here two different regular expressions which match the text that does not contain a given input word. Both use negative lookahead but behave differently on long input text. We read the text to match from a file, so that we can test our regex against arbitrarily long text. The word that should not occur is passed as the second argument as this is generally short in length.
This is the first regex

var fs = require('fs');
var inputStr = fs.readFileSync(process.argv[2])
                 .toString().replace('\n','');
//regex is /^((?!word).)*$/
var matcher = new RegExp('^((?!' + process.argv[3] + ').)*$');
console.log(matcher.test(inputStr));

download
and this is the second regex

var fs = require('fs');
var inputStr = fs.readFileSync(process.argv[2])
                 .toString().replace('\n','');
//regex is /^(?!.*word).*$/
var matcher = new RegExp('^(?!.*' + process.argv[3] + ').*$');
console.log(matcher.test(inputStr));

download (3)
First regular expression causes stack overflow error by some regex engines (tested with JavaScript/Java) when input text is long and does not contain the given word. It takes twice the number of steps to find the match when compared to the steps taken by second regex.


Parsing a sms thread in a readable way on Android

First get SMS Backup & Restore app from Play store and backup all the sms and then take the xml and run the following snippet to generate a readable conversation in a text file.

from __future__ import print_function
from xml.dom import minidom

xmldoc = minidom.parse('/Users/narendrayadala/Downloads/sms-20130317085953.xml')
sms_file = open('/Users/narendrayadala/Downloads/Jyothi.txt', 'w')
smslist = xmldoc.getElementsByTagName('sms')

for sms in smslist:
    if sms.attributes['contact_name'].value == 'Jyothi Nerella':
        if sms.attributes['type'].value == '1':
            print(sms.attributes['readable_date'].value + u' - Jyothi: ' + sms.attributes['body'].value, file=sms_file)
        else:
            print(sms.attributes['readable_date'].value + u' - Narendra: ' + sms.attributes['body'].value, file=sms_file)

Indenting blocks of text easily on MacVim

Being a windows user previously, I prefer selecting text using Shift+Arrow keys and increase/decrease indent of the text using Tab/Shift-Tab keys. Doing this on MacVim requires a little fiddling with the .vimrc file. The following needs to be added to the .vimrc file.

vnoremap <Tab> >gv
vnoremap <S-Tab> <gv

if has("gui_macvim")
    let macvim_hig_shift_movement = 1
endif

First section of the code causes vim to use Tab key as a mapping for >gv key sequence and second section of the code causes vim to use Shift+Arrow keys to select the text.


Using bind method of Function object

When new elements are dynamically added to the DOM, appropriate event handlers need to be attached to them. When attaching event handlers proper context needs to be set so that the event handler routines can use this object in a productive way. Since JavaScript 1.8.5 doing this is made simple using bind method of the Function object. Prior to the introduction of bind method, Function.apply method is used to set the this context. The following html/js code illustrates a typical usage of bind method.

<!DOCTYPE html>
<head>
   <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.8.1/jquery.min.js"></script>
</head>
<html>
	<body>
		<div id="container"></div>
	</body>
	<script type="text/javascript">
		function changeBackgroundColor(){
			$(this).css('background-color', this.attr('data-backgroundcolor'));
		}
                //the following data is generally obtained from server as a result of some request
		var newDiv = $("<div id='newDiv' data-backgroundcolor='#abc'><h4>Sample content from server</h4></div>")
		newDiv.bind('click', changeBackgroundColor.bind(newDiv));
		$('#container').append(newDiv);
	</script>
</html>

Here changeBackgroundColor.bind(newDiv) creates a new bound function with newDiv as this context inside changeBackgroundColor method.


A simple shell script to verify checksums

Often I find myself comparing checksums manually which is a bit cumbersome, so I wrote this trivial shell script to automate the verification.

#!/bin/bash
if [ $# -lt 3 ] ; then
        echo 'usage -- compare-checksum.sh hashfunction filename checksum'
        exit 1
elif [ `$1 $2 | awk '{print $1}'` = $3 ] ; then
        echo 'File not corrupted. We are saved.';
else
        echo 'File corrupted. We are doomed.';
fi

Usage:

compare-checksum.sh sha1sum foobar.txt 988881adc9fc3655077dc2d4d757d480b5ea0e11

If there are multiple files and their corresponding checksums, then one can write them in a file in the following format

988881adc9fc3655077dc2d4d757d480b5ea0e11  foobar1.txt
988881adc9fc3655077dc2d4d757d480b5ea0e11  foobar2.txt
988881adc9fc3655077dc2d4d757d480b5ea0e11  foobar3.txt
988881adc9fc3655077dc2d4d757d480b5ea0e11  foobar4.txt

and run the following command to verify all of the checksums at once

sha1sum -c checksum.txt

Generating all possible postfix expressions

The following Java program returns the list of all possible postfix expressions for a given list of operands and operators. The number of all postfix expressions grows exponentially with the number of operands and is around 5 billion for 10 operands and 4 operators (*,+,/,-). The precise number of all possible postfix expressions is C(n-1)*(noOfOperators^(n-1)) where n is the number of operands and C(n-1) is the n-1th Catalan number.

import java.util.ArrayList;
import java.util.List;
import java.util.Stack;

public class PostfixExpressionGeneratorSansRecursion {
	private static final String [] OPERATORS = new String[] {"*", "+", "/", "-"};
	
	public static void main(String[] args) {
		List<String> x = new ArrayList<String>();
		x.add("1");
		x.add("2");
		x.add("3");
		x.add("4");
		
		for (String ex : generateAllPostfixExpressions(x)){
			System.out.println(ex);
		}
	}
	
	/**
	 * Assumes input is a list of single digit positive numbers.
	 * @param operands
	 * @return list of all possible postfix expressions for the given list of operands
	 */
	private static List<String> generateAllPostfixExpressions(List<String> operands) {
		List<String> allPossiblePostFixExprs = new ArrayList<String>();
		Stack<String> stack = new Stack<String>();
		stack.push(""); //start with a empty partial expression
		int finalPFExprLength = 2*operands.size() - 1;
		int sizeOfOperands = operands.size();
		
		while (!stack.isEmpty()) {
			String partialExpr = stack.pop();
			int partialExprLength = partialExpr.length();
			
			if (partialExprLength == finalPFExprLength) {
				allPossiblePostFixExprs.add(partialExpr);
				continue;
			}
			
			int noOfOprdsConsumed = 0;
			for (char c : partialExpr.toCharArray())
				noOfOprdsConsumed += Character.isDigit(c) ? 1 : 0;
			
			if (2*noOfOprdsConsumed - partialExprLength > 1)
				for (String op : OPERATORS)
					stack.push(partialExpr + op);
			
			if (noOfOprdsConsumed != sizeOfOperands)
				stack.push(partialExpr + operands.get(noOfOprdsConsumed));
		}
		
	    return allPossiblePostFixExprs;
	}
}

Copying large files — BufferedReader vs FileChannel

When copying large files java.nio.channels.FileChannel takes 20% less time than java.io.BufferedReader. On an average FileChannel takes 25 seconds for each run and BufferedReader takes 30 seconds for each run. Here is the code used for the performance test. Size of test.txt is 1 GB. Used a buffer of size 8 KB as it was giving the best results. [Environment - -Windows 7/3GB RAM/i3 CPU M 370 @ 2.40GHz]

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class CopyFile {
	//Size of the test file is 1GB.
	private static final String INPUT_FILE_PATH = "test.txt";
	private static final String OUTPUT_FILE_PATH_FOR_FILE_CHANNEL = "test1.txt";
	private static final String OUTPUT_FILE_PATH_FOR_BUFFERED_READER = "test2.txt";
	private static final int DEFAULT_BUFFER_SIZE = 1024 * 8;

	public static void main(String[] args) throws Exception {
		Timer t = new Timer();

		t.start();
		copyFileUsingBufferedReader();
		System.out.println("Copying file using buffered reader takes " + t.end() 
                                                         + " milliseconds");

		t.start();
		copyFileUsingFileChannel();
		System.out.println("Reading file using file channel takes " + t.end() 
                                                         + " milliseconds");
	}

	private static void copyFileUsingFileChannel() throws IOException {
		FileChannel source = null;
		FileChannel destination = null;
		
		try {
			source = new FileInputStream(new File(INPUT_FILE_PATH)).getChannel();
			destination = new FileOutputStream(
                        new File(OUTPUT_FILE_PATH_FOR_FILE_CHANNEL)).getChannel();
			
			//This fails with Map Failed exception on large files
			//destination.transferFrom(source, 0, source.size());
			
			ByteBuffer buf = ByteBuffer.allocateDirect(DEFAULT_BUFFER_SIZE);
	        while((source.read(buf)) != -1) {
	                buf.flip();
	                destination.write(buf);
	                buf.clear();
	        }
		} finally {
			if (source != null) {
				source.close();
			}
			if (destination != null) {
				destination.close();
			}
		}
	}

	private static void copyFileUsingBufferedReader() throws IOException {

		BufferedInputStream source = new BufferedInputStream(
                new FileInputStream(new File(INPUT_FILE_PATH)));
		BufferedOutputStream destination = new BufferedOutputStream(
                new FileOutputStream(new File(OUTPUT_FILE_PATH_FOR_BUFFERED_READER)));
		byte[] buffer = new byte[DEFAULT_BUFFER_SIZE];

		try {
	        int n = 0;
	        while (-1 != (n = source.read(buffer))) {
	        	destination.write(buffer, 0, n);
	        }
	        destination.flush();
		} finally {
			if (source != null) {
				source.close();	
			}
			if (destination != null) {
				destination.close();				
			}
		}
	}

}

class Timer {
	long s;

	public Timer() {
	}

	public long start() {
		s = System.currentTimeMillis();
		return s;
	}

	public long end() {
		return System.currentTimeMillis() - s;
	}
}

Follow

Get every new post delivered to your Inbox.