Split delimited string value into rows
SQL Problem:
An external data vendor wants to give me a data field containing a pipe-delimited string value, which I find quite difficult to deal with.
Without help from an application programming language, is there a way to transform the string value into rows?
There is one difficulty, however: the field has an unknown number of delimited elements.
The DB engine in question is MySQL.
For example:
Input: Tuple(1, "a|b|c")
Output:
Tuple(1, "a")
Tuple(1, "b")
Tuple(1, "c")
Solution:
It may not be as difficult as I initially thought.
This is a general approach:
- Count the number of occurrences of the delimiter (the number of elements is one more than this):
length(val) - length(replace(val, '|', ''))
- Loop once per element, each time grabbing the next delimited value and inserting it into a second table.
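As a quick check of the counting step, the expression can be run against the example value from the question. This is only a sketch: the literal 'a|b|c' stands in for the real column, and CHAR_LENGTH() is used in place of LENGTH() on the assumption that values may contain multi-byte characters.

```sql
-- Count the '|' delimiters in the value; the element count is one more than that.
SELECT
    CHAR_LENGTH('a|b|c') - CHAR_LENGTH(REPLACE('a|b|c', '|', ''))     AS delimiter_count,  -- 2
    CHAR_LENGTH('a|b|c') - CHAR_LENGTH(REPLACE('a|b|c', '|', '')) + 1 AS element_count;    -- 3
```

Note that this arithmetic only holds for a single-character delimiter; for a longer delimiter the difference would have to be divided by the delimiter's length.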
Use this function by Federico Cargnelutti:
CREATE FUNCTION SPLIT_STR(
x VARCHAR(255),
delim VARCHAR(12),
pos INT
)
RETURNS VARCHAR(255)
RETURN REPLACE(SUBSTRING(SUBSTRING_INDEX(x, delim, pos),LENGTH(SUBSTRING_INDEX(x, delim, pos -1)) + 1),
delim, '');
Usage:
SELECT SPLIT_STR(string, delimiter, position);
You will need a loop to solve your problem.
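For instance, with the example value from the question (a sketch that assumes the SPLIT_STR function above has already been created):

```sql
SELECT SPLIT_STR('a|b|c', '|', 1);  -- 'a'
SELECT SPLIT_STR('a|b|c', '|', 2);  -- 'b'
SELECT SPLIT_STR('a|b|c', '|', 3);  -- 'c'
SELECT SPLIT_STR('a|b|c', '|', 4);  -- '' (empty string: the position is past the last element)
```

The empty string returned for a position past the last element is what a loop can use as its stop condition.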
Although your issue was probably resolved long ago, I was looking for a solution to the very same problem. I solved it with the help of a procedure referenced here, with slight adaptations to handle multi-byte characters (such as German umlauts) in the string by using CHAR_LENGTH() instead of LENGTH().
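DELIMITER $$

-- Returns the element at position pos (1-based), or NULL when that element is empty or missing.
CREATE FUNCTION SPLIT_STRING(val TEXT, delim VARCHAR(12), pos INT) RETURNS TEXT
BEGIN
    DECLARE output TEXT;
    -- CHAR_LENGTH() counts characters rather than bytes, so multi-byte text is cut at the right offset.
    SET output = REPLACE(SUBSTRING(SUBSTRING_INDEX(val, delim, pos), CHAR_LENGTH(SUBSTRING_INDEX(val, delim, pos - 1)) + 1), delim, '');
    IF output = '' THEN
        SET output = NULL;
    END IF;
    RETURN output;
END $$

-- Copies element 1 of every row, then element 2, and so on, until a pass inserts nothing.
CREATE PROCEDURE TRANSFER_CELL()
BEGIN
    DECLARE i INTEGER;
    DECLARE inserted_rows INTEGER;
    SET i = 1;
    REPEAT
        INSERT INTO NewTuple (id, value)
        SELECT id, SPLIT_STRING(value, '|', i)
        FROM Tuple
        WHERE SPLIT_STRING(value, '|', i) IS NOT NULL;
        -- Read ROW_COUNT() right after the INSERT, before any other statement runs.
        SET inserted_rows = ROW_COUNT();
        SET i = i + 1;
    UNTIL inserted_rows = 0
    END REPEAT;
END $$

DELIMITER ;

CALL TRANSFER_CELL();

-- Clean up the helper routines once the data has been transferred.
DROP FUNCTION SPLIT_STRING;
DROP PROCEDURE TRANSFER_CELL;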
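The procedure above reads from Tuple and writes to NewTuple, so both tables have to exist before CALL TRANSFER_CELL() runs. A minimal setup sketch for trying it out follows; the column types are assumptions, since the source gives only the column names id and value.

```sql
-- Hypothetical table definitions; only the names Tuple, NewTuple, id and value come from the answer.
CREATE TABLE Tuple (
    id    INT,
    value TEXT      -- pipe-delimited string as delivered by the vendor
);

CREATE TABLE NewTuple (
    id    INT,
    value TEXT      -- one delimited element per row
);

-- Sample row from the question.
INSERT INTO Tuple (id, value) VALUES (1, 'a|b|c');
```

With this data in place, calling the procedure should leave three rows in NewTuple, (1, 'a'), (1, 'b') and (1, 'c'), which can be checked with SELECT id, value FROM NewTuple. Because the loop stops at the first pass that inserts nothing, and empty elements are turned into NULL and skipped, a position that is empty in every row would end the transfer early.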