Best way to save a matrix of Data approx 1000 cols by 700 rows?

  • Hi all,

    Does anyone have any suggestions for loading and storing a matrix of data that changes in either dimension on a daily basis?

    I have to keep each day's data together as a single entity. I know this is not what an RDBMS is designed for, but it's what I have.

    Feel free to chip in with anything sensible.

    Thanks

    Will

  • What about converting it into an XML document and then storing it as the xml datatype?
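    A minimal sketch of that approach, assuming a table keyed by load date (all table, column and element names here are just placeholders):

```sql
-- Hypothetical table: one row per day, the whole matrix as xml
CREATE TABLE DailyMatrix
(
    LoadDate   datetime NOT NULL PRIMARY KEY,
    MatrixData xml      NOT NULL
);

-- Store one day's matrix as a single entity
INSERT INTO DailyMatrix (LoadDate, MatrixData)
VALUES ('20070101',
        '<matrix>
           <row r="1"><cell c="1">A</cell><cell c="2">B</cell></row>
           <row r="2"><cell c="1">D</cell><cell c="2">E</cell></row>
         </matrix>');

-- Shred it back into rows and columns with nodes()/value() (SQL 2005)
SELECT m.LoadDate,
       r.x.value('@r', 'int')         AS RowNum,
       c.x.value('@c', 'int')         AS ColNum,
       c.x.value('.',  'varchar(50)') AS CellValue
FROM DailyMatrix AS m
CROSS APPLY m.MatrixData.nodes('/matrix/row') AS r(x)
CROSS APPLY r.x.nodes('cell') AS c(x);
```

    Bear in mind that shredding a 700 x 1000 matrix this way produces around 700,000 rows, so it's worth testing the query on realistic volumes before committing to it.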

    Markus

    Markus Bohse

  • Thanks Markus,

    Would there be a big overhead in then converting the xml datatype back to columns and rows, once I had it in the db as XML? Is it possible to do?

    I'm in a new environment here with SQL 2005 and haven't used the xml datatype before.

    All tips gratefully received.

    Will

  • I think I'd just go with a simple table - row# or name, col# or name, value. So for a 3x3 matrix, you might have these:

    1, 1, "A"

    1, 2, "B"

    1, 3, "C"

    2, 1, "D"

    2, 2, "E"

    Etc

    That will store any matrix you want. The easiest way to handle changes would be to delete all the rows for the matrix from the table and then add the new matrix back in a transaction. A fancier implementation might handle just the adds/changes/deletes done to the matrix, but it would be some extra work and you'd have to work to minimize round trips.
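    A sketch of that in T-SQL - I've added a LoadDate key so each day's matrix stays together as the thread requires; all names are illustrative:

```sql
-- One row per matrix cell
CREATE TABLE MatrixCell
(
    LoadDate  datetime    NOT NULL,
    RowNum    int         NOT NULL,
    ColNum    int         NOT NULL,
    CellValue varchar(50) NULL,
    CONSTRAINT PK_MatrixCell PRIMARY KEY (LoadDate, RowNum, ColNum)
);

-- Replace a day's matrix atomically: delete the old rows,
-- add the new matrix back, all in one transaction
BEGIN TRANSACTION;
    DELETE FROM MatrixCell WHERE LoadDate = '20070101';
    INSERT INTO MatrixCell (LoadDate, RowNum, ColNum, CellValue)
    VALUES ('20070101', 1, 1, 'A');
    -- ... bulk-load the rest of the day's cells here ...
COMMIT TRANSACTION;
```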

  • Hi,

    Did you think about storing your data in a star schema, with columns and rows as dimensions, adding a date dimension to allow you to identify the data belonging to a certain day?

    You can then add data on a daily basis and decide to keep a certain range of days or delete them if not needed any more.
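    A rough sketch of what that might look like (all table and column names here are my own invention):

```sql
-- Dimensions for date, row and column
CREATE TABLE DimDate   (DateKey int PRIMARY KEY, LoadDate datetime    NOT NULL);
CREATE TABLE DimRow    (RowKey  int PRIMARY KEY, RowName  varchar(50) NOT NULL);
CREATE TABLE DimColumn (ColKey  int PRIMARY KEY, ColName  varchar(50) NOT NULL);

-- Fact table: one row per matrix cell per day
CREATE TABLE FactMatrix
(
    DateKey   int NOT NULL REFERENCES DimDate(DateKey),
    RowKey    int NOT NULL REFERENCES DimRow(RowKey),
    ColKey    int NOT NULL REFERENCES DimColumn(ColKey),
    CellValue varchar(50) NULL,
    CONSTRAINT PK_FactMatrix PRIMARY KEY (DateKey, RowKey, ColKey)
);

-- Age out days that are no longer needed
DELETE FROM FactMatrix
WHERE DateKey IN (SELECT DateKey FROM DimDate WHERE LoadDate < '20060101');
```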

    Michael

  • Thanks Michael,

    I have no prior knowledge of data warehousing so I'll read up on your suggestion. I'm working my way through the SSIS loading mechanism now.

    Thanks to everybody for your help and keep all the suggestions coming please.

    Will

  • You need to have a better idea of that 1000x700 matrix. Besides the dimensions changing, you need to ask yourself:

    How sparse is it?

    How many entries change between one day and the next?

    How much and what kind of data is there in each matrix location?

    Without answers to at least these questions there will be no meaningful answer.

  • Hi Michael,

    All valid questions.

    How sparse is it?

    All fields in the matrix would be populated. The number of columns could vary widely on a daily basis: day 1, 1200 cols; day 2, 950 cols; day 3, 800 cols; etc. The number of rows is pretty static by comparison, between 650 and 700.

    How many entries change between one day and the next?

    Almost all of the data would change on a daily basis, and each day's data must be saved in its entirety. I would never want to update any data.

    How much and what kinds of data is there in each matrix location?

    A row of financial IDs, one column of business dates, and the rest of the data would be numeric(18,2) held as varchar, as no computation would need to be performed on the *numeric* data.

    What I have done is count the number of rows in the input dataset and create a table on the fly using that row count as the number of columns. I then transpose the data so data columns become data rows; that gets me around hitting the maximum number of columns in a SQL table (1024).
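    For what it's worth, that on-the-fly table creation looks roughly like this - the staging table name is hypothetical, and with 650-700 input rows the transposed table stays safely under the 1024-column limit:

```sql
-- Build a CREATE TABLE statement with one column per input row
DECLARE @rows int, @i int, @sql nvarchar(max);

SELECT @rows = COUNT(*) FROM StagingInput;  -- hypothetical staging table

SET @sql = N'CREATE TABLE DailyMatrix_20070101 (FinancialId varchar(50) NOT NULL';
SET @i = 1;
WHILE @i <= @rows
BEGIN
    SET @sql = @sql + N', c' + CAST(@i AS nvarchar(10)) + N' varchar(50) NULL';
    SET @i = @i + 1;
END
SET @sql = @sql + N')';

EXEC sp_executesql @sql;
-- ...then insert the transposed data into the new table
```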

    I'm just wondering how other guys would skin this cat.

    Will

  • So you have:

    null    fid1   fid2   fid3  ...
    date1   val11  val12  val13 ...
    date2   val21  val22  val23 ...
    date3   val31  val32  val33 ...
    ...

    I presume you receive this as a text file of some sort.

    Certainly you can save it into a table blob without great difficulty.

    Somewhere in this forum I published some VBScript that lets you do this quite easily.

    The other issue is that if there is a frequent need to do something with this data, then you might want to pre-process the original into some format that lets you quickly load it into a real SQL Server table - like what Andy mentioned.
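    For reference, SQL 2005 can also pull a whole file into a blob with T-SQL alone, via OPENROWSET(BULK ..., SINGLE_BLOB). A sketch - the table and file path are just examples:

```sql
-- One blob per day's file
CREATE TABLE MatrixBlob
(
    LoadDate datetime       NOT NULL PRIMARY KEY,
    RawFile  varbinary(max) NOT NULL
);

-- Load the overnight file in one shot
INSERT INTO MatrixBlob (LoadDate, RawFile)
SELECT '20070101', f.BulkColumn
FROM OPENROWSET(BULK 'C:\feeds\matrix_20070101.txt', SINGLE_BLOB) AS f;
```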

  • Michael ,

    It's like you've seen the file already; yep, I receive it as an overnight tab-delimited text file.

    My transpose stored proc does exactly what Andy suggested (thanks Andy) anyway,

    My users would search for a matrix by LoadDate and for all or a subset of positions within the matrix, so would the blob still cut it for that?

    Can I rip up a blob into cols and rows once I had the data in a SQL blob field?

    Will

  • I have never 'ripped up' a blob like you want to do. I would stay away from doing this in a stored procedure - it would be very slow. Neither do I know if you can manipulate a blob in a stored procedure using column variables or @-variables. As I said earlier, the only way I have ever dealt with getting blobs in and out is via VBScript.

    Thus the problem is one of transforming the original text file into something and storing it in a blob. The format should be such that it can quickly be unloaded from the blob and transformed into a real SQL Server table.

    The things that come to my mind are:

    1. Transform it into MS Access. Once unloaded to a file you can then get at the data using a SQL Server link.

    2. Transform it into a SQL Server table. Surely there is a way to export/import a single table using DTS (don't know).

    3. XML - although I haven't dealt much with this on SQL Server.

    The problem is that if you need to save the original then you would have to deal with two blobs.

    The technical solution would depend a lot on the application context in which all this manipulation occurs.
