- Transact-SQL (T-SQL)
This is a quick article on how to split a single row into multipe rows based on the value of a column in the same table.
Why?
I have a table that has all the days of sickness of employees. This table contains, which employee, on what date, and for how many days they were off sick. When migrating to a new system, the destination wanted 1 row per day. This meant that if in the old system, there was 1 row with an employee who took 2 days off, we would want 2 rows for that date for the same employee in the new system.
What we have:
SELECT EmployeeNo , DaysOffSick , DateOfSickness FROM Employees_Attendance_Table -- yields EmployeeNo DaysOffSick DateOfSickness -------------- -------------- ---------------- 001 1.00 1997-11-17 002 3.00 2000-02-18 003 2.00 1999-02-25
- SELECT
- EmployeeNo
- , DaysOffSick
- , DateOfSickness
- FROM Employees_Attendance_Table
- -- yields
- EmployeeNo DaysOffSick DateOfSickness
- -------------- -------------- ----------------
- 001 1.00 1997-11-17
- 002 3.00 2000-02-18
- 003 2.00 1999-02-25
EmployeeNo DaysOffSick DateOfSickness -------------- -------------- ---------------- 001 1.00 1997-11-17 002 1.00 2000-02-18 002 1.00 2000-02-19 002 1.00 2000-02-20 003 1.00 1999-02-25 003 1.00 1999-02-26 -- note the dates increment and do not account for days off (eg. Saturday / Sunday)
- EmployeeNo DaysOffSick DateOfSickness
- -------------- -------------- ----------------
- 001 1.00 1997-11-17
- 002 1.00 2000-02-18
- 002 1.00 2000-02-19
- 002 1.00 2000-02-20
- 003 1.00 1999-02-25
- 003 1.00 1999-02-26
- -- note the dates increment and do not account for days off (eg. Saturday / Sunday)
How?
Adapted from a forum topic: SQLTeam Forums
First of all, we want the rows to multiply based on the value of the column:
SELECT EmployeeNo , DaysOffSick , DateOfSickness ,'1' + substring(CAST(DaysOffSick AS VARCHAR(10)), 2, 1000) AS Items FROM Employees_Attendance_Table JOIN master..spt_values n ON n.type = 'P' AND n.number < CONVERT(int, LEFT(CAST( DaysOffSick AS INT ),3)) -- yields EmployeeNo DaysOffSick DateOfSickness Items -------------- -------------- ---------------- -------- 001 1.00 1997-11-17 1.00 002 3.00 2000-02-18 1.00 002 3.00 2000-02-18 1.00 002 3.00 2000-02-18 1.00 003 2.00 1999-02-25 1.00 003 2.00 1999-02-25 1.00
- SELECT
- EmployeeNo
- , DaysOffSick
- , DateOfSickness
- ,'1' + substring(CAST(DaysOffSick AS VARCHAR(10)), 2, 1000) AS Items
- FROM
- Employees_Attendance_Table
- JOIN
- master..spt_values n
- ON n.type = 'P'
- AND n.number < CONVERT(int, LEFT(CAST( DaysOffSick AS INT ),3))
- -- yields
- EmployeeNo DaysOffSick DateOfSickness Items
- -------------- -------------- ---------------- --------
- 001 1.00 1997-11-17 1.00
- 002 3.00 2000-02-18 1.00
- 002 3.00 2000-02-18 1.00
- 002 3.00 2000-02-18 1.00
- 003 2.00 1999-02-25 1.00
- 003 2.00 1999-02-25 1.00
Almost there, but in the above example, the dates are wrong for subsequent days off sick:
SELECT EmployeeNo , DaysOffSick , DateOfSickness ,'1' + substring(CAST(DaysOffSick AS VARCHAR(10)), 2, 1000) AS Items , ROW_NUMBER() OVER (PARTITION BY EmployeeNo, DateOfSickness ORDER BY EmployeeNo, DateOfSickness) AS Rownum FROM Employees_Attendance_Table JOIN master..spt_values n ON n.type = 'P' AND n.number < CONVERT(int, LEFT(CAST( DaysOffSick AS INT ),3)) -- yields EmployeeNo DaysOffSick DateOfSickness Items Rownum -------------- -------------- ---------------- ----------- ----------------- 001 1.00 1997-11-17 1.00 1 002 3.00 2000-02-18 1.00 1 002 3.00 2000-02-18 1.00 2 002 3.00 2000-02-18 1.00 3 003 2.00 1999-02-25 1.00 1 003 2.00 1999-02-25 1.00 2
- SELECT
- EmployeeNo
- , DaysOffSick
- , DateOfSickness
- ,'1' + substring(CAST(DaysOffSick AS VARCHAR(10)), 2, 1000) AS Items
- , ROW_NUMBER() OVER (PARTITION BY EmployeeNo, DateOfSickness ORDER BY EmployeeNo, DateOfSickness) AS Rownum
- FROM
- Employees_Attendance_Table
- JOIN
- master..spt_values n
- ON n.type = 'P'
- AND n.number < CONVERT(int, LEFT(CAST( DaysOffSick AS INT ),3))
- -- yields
- EmployeeNo DaysOffSick DateOfSickness Items Rownum
- -------------- -------------- ---------------- ----------- -----------------
- 001 1.00 1997-11-17 1.00 1
- 002 3.00 2000-02-18 1.00 1
- 002 3.00 2000-02-18 1.00 2
- 002 3.00 2000-02-18 1.00 3
- 003 2.00 1999-02-25 1.00 1
- 003 2.00 1999-02-25 1.00 2
Great! Now let's just add the Rownum value to the date of sickness (minus 1 for obvious reasons - 25th + row(1) = 26th when the person was off the 25th):
SELECT EmployeeNo , '1' + substring(CAST(DaysOffSick AS VARCHAR(10)), 2, 1000) AS DaysOffSick , DATEADD(d, ROW_NUMBER() OVER (PARTITION BY EmployeeNo, DateOfSickness ORDER BY EmployeeNo, DateOfSickness) - 1, DateOfSickness) AS DateOffSick FROM Employees_Attendance_Table JOIN master..spt_values n ON n.type = 'P' AND n.number < CONVERT(int, LEFT(CAST( DaysOffSick AS INT ),3)) -- yields EmployeeNo DaysOffSick DateOffSick -------------- -------------- --------------------- 001 1.00 1997-11-17 002 1.00 2000-02-18 002 1.00 2000-02-19 002 1.00 2000-02-20 003 1.00 1999-02-25 003 1.00 1999-02-26
- SELECT
- EmployeeNo
- , '1' + substring(CAST(DaysOffSick AS VARCHAR(10)), 2, 1000) AS DaysOffSick
- , DATEADD(d, ROW_NUMBER() OVER (PARTITION BY EmployeeNo, DateOfSickness ORDER BY EmployeeNo, DateOfSickness) - 1, DateOfSickness) AS DateOffSick
- FROM
- Employees_Attendance_Table
- JOIN
- master..spt_values n
- ON n.type = 'P'
- AND n.number < CONVERT(int, LEFT(CAST( DaysOffSick AS INT ),3))
- -- yields
- EmployeeNo DaysOffSick DateOffSick
- -------------- -------------- ---------------------
- 001 1.00 1997-11-17
- 002 1.00 2000-02-18
- 002 1.00 2000-02-19
- 002 1.00 2000-02-20
- 003 1.00 1999-02-25
- 003 1.00 1999-02-26
Some thing(s) to consider:
- User running the SQL query needs permission to read the master database.
- If run by a service account, ask a DBA to emulate the user for testing.
- the Items column may vary from 1 and 10+. Not sure on why this is but it doesn't seem to affect the functionality of the above and you could always hardcode a 1.00 if its confusing.
- In theory, if you know the shift patterns of an employee, you could enhance the above to skip day-off dates (eg. Saturday/Sunday - the above does not take this into account - my day-job company is a 24/7 business).