FacebookTwitterDiggDeliciousStumbleuponGoogle BookmarksRedditLinkedin

Levenshtein distance MySQL stored function

User Defined Functions are native functions that are compiled and then loaded by the server similar to a plugin. As with plugins, compiling can be a pest as it requires some of the server MySQL header files and matching build switches to the server it’s going to be loaded in. Consequentially, binaries cannot be considered safely portable and that means that you don’t really want to have a project rely on UDFs as it can hinder adoption quite severely.

Since MySQL 5.0 we can also use SQL stored functions and procedures. Slower, of course, but functional and portable. By the way, there’s one thing you can do with UDFs that you (at least currently) can’t do with stored functions, and that’s create a new aggregate function (like SUM or COUNT).

Levenshtein reference: http://en.wikipedia.org/wiki/Levenshtein_distance

The code

 

DELIMITER $$
DROP FUNCTION IF EXISTS LEVENSHTEIN $$
CREATE FUNCTION LEVENSHTEIN(s1 VARCHAR(255) CHARACTER SET utf8, s2 VARCHAR(255) CHARACTER SET utf8)
  RETURNS INT
  DETERMINISTIC
  BEGIN
    DECLARE s1_len, s2_len, i, j, c, c_temp, cost INT;
    DECLARE s1_char CHAR CHARACTER SET utf8;
    -- max strlen=255 for this function
    DECLARE cv0, cv1 VARBINARY(256);

    SET s1_len = CHAR_LENGTH(s1),
        s2_len = CHAR_LENGTH(s2),
        cv1 = 0x00,
        j = 1,
        i = 1,
        c = 0;

    IF (s1 = s2) THEN
      RETURN (0);
    ELSEIF (s1_len = 0) THEN
      RETURN (s2_len);
    ELSEIF (s2_len = 0) THEN
      RETURN (s1_len);
    END IF;

    WHILE (j <= s2_len) DO
      SET cv1 = CONCAT(cv1, CHAR(j)),
          j = j + 1;
    END WHILE;

    WHILE (i <= s1_len) DO
      SET s1_char = SUBSTRING(s1, i, 1),
          c = i,
          cv0 = CHAR(i),
          j = 1;

      WHILE (j <= s2_len) DO
        SET c = c + 1,
            cost = IF(s1_char = SUBSTRING(s2, j, 1), 0, 1);

        SET c_temp = ORD(SUBSTRING(cv1, j, 1)) + cost;
        IF (c > c_temp) THEN
          SET c = c_temp;
        END IF;

        SET c_temp = ORD(SUBSTRING(cv1, j+1, 1)) + 1;
        IF (c > c_temp) THEN
          SET c = c_temp;
        END IF;

        SET cv0 = CONCAT(cv0, CHAR(c)),
            j = j + 1;
      END WHILE;

      SET cv1 = cv0,
          i = i + 1;
    END WHILE;

    RETURN (c);
  END $$

DELIMITER ;

 

Read more about this in the original post from stoyanstefanov in openquery.com.au

 

* If you are thinking to use this MySQL function in production, I highly recommend to take a look to this version bellow, it will give you much better performance.

Damerau Levenshtein

How to limit results from join

I found this as a very common question when people start in sql. You can use sub-queries but I wouldn't recommend that as its slower and not scalable. So.. I like the method I show you bellow:

Create a variable and save on it the row number(@num).
 
We also need another variable to save the "page" id as its necessary to restart the row count when a different page appears.
 
Finally we filter by @num with as much rows we want. Remember matchNO is a calculated field so you cant filter in a where statment, use having instead.

The code

set @num := 0, @parent := 0;
SELECT p.[yourFields],IF(@parent = p.id,@num:=@num+1,@num:=1) as matchNO, @parent:=p.id
	FROM parent p
		LEFT JOIN child c on c.p_id = p.id
			WHERE [yourFilters]
				having matchNO <= 3;

 

The row number is necessary to figure out in what row you are. The page variable tell you the page in the last row so you can compare and then set @num to 1 again when necesary.

sleep() in JavaScript

PHP has a sleep() function, but JavaScript doesn't. Well, this is because it's useless, you might say, and you'll be right. But for simulating heavy processing and for misc performance measurements, it could be useful. So here's how you can go about creating a sleep() in JavaScript.

The code

function sleep(milliseconds) {
  var start = new Date().getTime();
  for (var i = 0; i < 1e7; i++) {
    if ((new Date().getTime() - start) > milliseconds){
      break;
    }
  }
}

Usage

Sleep for 1 second (1000 milliseconds):

console.log(new Date());
console.log('Dude!');
sleep(1000);
console.log(new Date());

 

Read more about this in the original post from stoyanstefanov in phpied.com

Freedom Code

All existing content on this site can be used and modified to pleasure the developer who implements it. 

There are full and complete freedom to use the code. Developers can collaborate on this projects or recommend changes for improve it.

Contact