Uncontrolled Code

  • Gary Varga (1/23/2014)


    EXCEL IS NOT A DBMS!

    If you are happy with that then so am I as I think that the statement above captures more succinctly what the issue is with using Excel in the ways that it often is and covers both Jim's point and remains a rallying cry.

    Fair enough! Count me in!

  • Of course, what IS a good practice, when its a reporting solution, is to develop properly a high quality data cube and give the Excel users free rein when it comes to using that to create reporting solutions.

    All the complexity and most of the business logic and therefore Risk is taken out and put in the correct tier where it can be developed with due care and attention. You can also be sure it meets the requirements of all parts of the business, then let the users do the part they're usually the best people to do - the presentation and presentation logic tiers - it's fairly safe for them to do those reports, graphs, KPIs etc without the usual overhead of a Dev team.

    It'll also isolate them from the underlying changes in source data - you can cope with that in your ETL.

  • chris.smith 91049 (1/23/2014)


    Of course, what IS a good practice, when its a reporting solution, is to develop properly a high quality data cube and give the Excel users free rein when it comes to using that to create reporting solutions.

    All the complexity and most of the business logic and therefore Risk is taken out and put in the correct tier where it can be developed with due care and attention. You can also be sure it meets the requirements of all parts of the business, then let the users do the part they're usually the best people to do - the presentation and presentation logic tiers - it's fairly safe for them to do those reports, graphs, KPIs etc without the usual overhead of a Dev team.

    It'll also isolate them from the underlying changes in source data - you can cope with that in your ETL.

    What about drill to detail, then export and manipulate?

    Depending on what you expose, and what they try to do with it, troubles can arise.

    Some might also depend on 1 cube or multiple cubes.

    Mixing data could be a possible issue.

    We had a cube for a phone system, and one for our ERP.

    I had a Customer Service Manager think these could be spliced together to create a report.

    Sales had very little to do with some of the activity logged in the phone system.

  • Greg Edwards-268690 (1/23/2014)


    chris.smith 91049 (1/23/2014)


    Of course, what IS a good practice, when its a reporting solution, is to develop properly a high quality data cube and give the Excel users free rein when it comes to using that to create reporting solutions.

    All the complexity and most of the business logic and therefore Risk is taken out and put in the correct tier where it can be developed with due care and attention. You can also be sure it meets the requirements of all parts of the business, then let the users do the part they're usually the best people to do - the presentation and presentation logic tiers - it's fairly safe for them to do those reports, graphs, KPIs etc without the usual overhead of a Dev team.

    It'll also isolate them from the underlying changes in source data - you can cope with that in your ETL.

    What about drill to detail, then export and manipulate?

    Depending on what you expose, and what they try to do with it, troubles can arise.

    Some might also depend on 1 cube or multiple cubes.

    Mixing data could be a possible issue.

    We had a cube for a phone system, and one for our ERP.

    I had a Customer Service Manager think these could be spliced together to create a report.

    Sales had very little to do with some of the activity logged in the phone system.

    Yeah, it's hard to completely remove any chance of end-user error while end-users are still involved in developing in some way. A good compromise, I think, is still to give them cubes, but also have a somewhat more data-literate MI Team, whose skill lie within desktop products, like MS Office, but do have a more fundamental understanding - they can give the flexibility and speed required when it's something straightforward like reports from MI, but also should have the experience to know when there's something that needs escalating up to the development team and be dealt with in the full SDLC.

  • Excel can be a cog in a large organization's data processing lifecycle, somthing that exists peripheral to the RDMS, but that's not always the case.

    There are a lot of small organizaions and individuals who use Excel to do things like manage their household budget or keep track of tax deductable expenses. There is no relational database, and they don't have an IT department to set one up for them. For these guys, it would be very beneficial for Excel to support typed columns.

    There is nothing inherently wrong with using spreadsheets like Excel as a database for small sets of data, the problem is that front end tool that manages them is too flexible. The focus of these tools are on presentation and ease of use, not data integrity and accuracy.

    For example, if a column contains the values 5.5 and 7.25, Excel will assume numeric datatype for those (2) cells, and the SUM for the column will be 12.75. However, if the user accidentally enters 8,5 in the 3rd cell, Excel will assume a data type of text for that specific cell, and the SUM will still be 12.75. By default, there is no input validation and no visual queue that one of the cells has been excluded from the total.

    If the user could simply click on a column header and hard type it as something like numeric(9,1), then an attempt to enter an invalid value would be rejected and result in a warning. That's a small thing, but it would make all the difference in the world.

    Yes, you can do this with some VBA code, but only a geek would expect non-IT users to program VBA scripts or expect such a basic feature to require VBA. Applying a strong data type to a column in a speadsheet should be a matter of just making two or three clicks.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (1/24/2014)


    Excel can be a cog in a large organization's data processing lifecycle, somthing that exists peripheral to the RDMS, but that's not always the case.

    There are a lot of small organizaions and individuals who use Excel to do things like manage their household budget or keep track of tax deductable expenses. There is no relational database, and they don't have an IT department to set one up for them. For these guys, it would be very beneficial for Excel to support typed columns.

    There is nothing inherently wrong with using spreadsheets like Excel as a database for small sets of data, the problem is that front end tool that manages them is too flexible. The focus of these tools are on presentation and ease of use, not data integrity and accuracy.

    For example, if a column contains the values 5.5 and 7.25, Excel will assume numeric datatype for those (2) cells, and the SUM for the column will be 12.75. However, if the user accidentally enters 8,5 in the 3rd cell, Excel will assume a data type of text for that specific cell, and the SUM will still be 12.75. By default, there is no input validation and no visual queue that one of the cells has been excluded from the total.

    If the user could simply click on a column header and hard type it as something like numeric(9,1), then an attempt to enter an invalid value would be rejected and result in a warning. That's a small thing, but it would make all the difference in the world.

    Yes, you can do this with some VBA code, but only a geek would expect non-IT users to program VBA scripts or expect such a basic feature to require VBA. Applying a strong data type to a column in a speadsheet should be a matter of just making two or three clicks.

    I am sorry Eric but what end user would understand numeric(9,1)? Maybe if it was something simple like number otherwise you are getting into the realms of someone who could do more for themselves.

    Having said that, there are plenty of cheap(ish) apps targeted at budgeting etc.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • Jim P. (1/23/2014)


    It is us in IT whose responsibility is to highlight how a solution can be provided with as many of the risks mitigated without stripping the users of their productivity.

    Nowhere did I say strip the users of capabilities. But if you are going to make Excel a cornerstone of company decisions I hope your company has St. Simeon as their saint.

    Do you not mean http://en.wikipedia.org/wiki/Jude_the_Apostle ?

    I'm a DBA.
    I'm not paid to solve problems. I'm paid to prevent them.

  • Gary Varga (1/24/2014)


    Eric M Russell (1/24/2014)


    Excel can be a cog in a large organization's data processing lifecycle, somthing that exists peripheral to the RDMS, but that's not always the case.

    There are a lot of small organizaions and individuals who use Excel to do things like manage their household budget or keep track of tax deductable expenses. There is no relational database, and they don't have an IT department to set one up for them. For these guys, it would be very beneficial for Excel to support typed columns.

    There is nothing inherently wrong with using spreadsheets like Excel as a database for small sets of data, the problem is that front end tool that manages them is too flexible. The focus of these tools are on presentation and ease of use, not data integrity and accuracy.

    For example, if a column contains the values 5.5 and 7.25, Excel will assume numeric datatype for those (2) cells, and the SUM for the column will be 12.75. However, if the user accidentally enters 8,5 in the 3rd cell, Excel will assume a data type of text for that specific cell, and the SUM will still be 12.75. By default, there is no input validation and no visual queue that one of the cells has been excluded from the total.

    If the user could simply click on a column header and hard type it as something like numeric(9,1), then an attempt to enter an invalid value would be rejected and result in a warning. That's a small thing, but it would make all the difference in the world.

    Yes, you can do this with some VBA code, but only a geek would expect non-IT users to program VBA scripts or expect such a basic feature to require VBA. Applying a strong data type to a column in a speadsheet should be a matter of just making two or three clicks.

    I am sorry Eric but what end user would understand numeric(9,1)? Maybe if it was something simple like number otherwise you are getting into the realms of someone who could do more for themselves.

    Having said that, there are plenty of cheap(ish) apps targeted at budgeting etc.

    I expect most end user of a spreadsheet to know what numeric data type with two decimal places means; they just need a simple dialog driven method to implement it. For example, the user would click on a column and then use the existing Format Cells.. dialog box to choose Numeric data type with 2 decimal places. The process would be identical to applying format to a range of cells, except in this context there would also be an additional check box that would indicate wether the column formatting would be strongly typed, meaning that Excel would not accept non-numeric data entry into any cell within that column. The Paid Amt column should only contain numeric data, and the Zip Code column would accept only input from a predefined list of valid zip codes, which the user has pasted into another sheet.

    Here is an example of why it matters:

    Did Harvard Economists Make an Excel Error that Led to Economic Austerity?

    http://finance.yahoo.com/blogs/daily-ticker/did-harvard-economists-excel-error-lead-economic-austerity-185852805.html

    True, other database applications like FileMaker Pro and MS Access support actual tables, and some would argue based on stories like the one above that Excel was not appropraite for the job. But users shouldn't necessarily have to choose between strongly typed schema versus spreadsheets. It's conceptually possible to overlay schema on a column/record/cell based spreadsheet.

    There are other 3rd parties like Google and OpenOffice who are developing front end applications for managing Excel spreadsheets. I'd hate for Microsoft to get beat on their own home turf. If they are marketing Excel as a serious data analysis tool, then they should provide features to help insure the data at least meets a minimum level of conformity. What I'm proposing above is a simple solution to a common and potentially serious problem.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • andrew gothard (1/24/2014)


    Do you not mean http://en.wikipedia.org/wiki/Jude_the_Apostle ?

    If you read the St Simeon link he is considered the patron saint of fools.



    ----------------
    Jim P.

    A little bit of this and a little byte of that can cause bloatware.

  • Eric M Russell (1/24/2014)


    Gary Varga (1/24/2014)


    Eric M Russell (1/24/2014)


    Excel can be a cog in a large organization's data processing lifecycle, somthing that exists peripheral to the RDMS, but that's not always the case.

    There are a lot of small organizaions and individuals who use Excel to do things like manage their household budget or keep track of tax deductable expenses. There is no relational database, and they don't have an IT department to set one up for them. For these guys, it would be very beneficial for Excel to support typed columns.

    There is nothing inherently wrong with using spreadsheets like Excel as a database for small sets of data, the problem is that front end tool that manages them is too flexible. The focus of these tools are on presentation and ease of use, not data integrity and accuracy.

    For example, if a column contains the values 5.5 and 7.25, Excel will assume numeric datatype for those (2) cells, and the SUM for the column will be 12.75. However, if the user accidentally enters 8,5 in the 3rd cell, Excel will assume a data type of text for that specific cell, and the SUM will still be 12.75. By default, there is no input validation and no visual queue that one of the cells has been excluded from the total.

    If the user could simply click on a column header and hard type it as something like numeric(9,1), then an attempt to enter an invalid value would be rejected and result in a warning. That's a small thing, but it would make all the difference in the world.

    Yes, you can do this with some VBA code, but only a geek would expect non-IT users to program VBA scripts or expect such a basic feature to require VBA. Applying a strong data type to a column in a speadsheet should be a matter of just making two or three clicks.

    I am sorry Eric but what end user would understand numeric(9,1)? Maybe if it was something simple like number otherwise you are getting into the realms of someone who could do more for themselves.

    Having said that, there are plenty of cheap(ish) apps targeted at budgeting etc.

    I expect most end user of a spreadsheet to know what numeric data type with two decimal places means; they just need a simple dialog driven method to implement it. For example, the user would click on a column and then use the existing Format Cells.. dialog box to choose Numeric data type with 2 decimal places. The process would be identical to applying format to a range of cells, except in this context there would also be an additional check box that would indicate wether the column formatting would be strongly typed, meaning that Excel would not accept non-numeric data entry into any cell within that column. The Paid Amt column should only contain numeric data, and the Zip Code column would accept only input from a predefined list of valid zip codes, which the user has pasted into another sheet.

    Here is an example of why it matters:

    Did Harvard Economists Make an Excel Error that Led to Economic Austerity?

    http://finance.yahoo.com/blogs/daily-ticker/did-harvard-economists-excel-error-lead-economic-austerity-185852805.html

    True, other database applications like FileMaker Pro and MS Access support actual tables, and some would argue based on stories like the one above that Excel was not appropraite for the job. But users shouldn't necessarily have to choose between strongly typed schema versus spreadsheets. It's conceptually possible to overlay schema on a column/record/cell based spreadsheet.

    There are other 3rd parties like Google and OpenOffice who are developing front end applications for managing Excel spreadsheets. I'd hate for Microsoft to get beat on their own home turf. If they are marketing Excel as a serious data analysis tool, then they should provide features to help insure the data at least meets a minimum level of conformity. What I'm proposing above is a simple solution to a common and potentially serious problem.

    Please review Data Validation in Excel. You can define types of data acceptable for input into the range.

    Doesn't require coding, just setting it up.

    Zip Code might be a slightly different twist.

    A Zip Code itself can be perfectly valid, but not be correct for the address.

    I find the link interesting.

    They say Excel error, but never mention what is the actual error.

    And Selective use of data - that is someone making up their own rules.

    Mistake? Or intentional?

    I have seen both, and seen it done in Excel as well as using filters on the data in formal tools.

    Which makes Context being front and center a real topic too.

  • Greg Edwards-268690 (1/24/2014)


    Eric M Russell (1/24/2014)


    There are other 3rd parties like Google and OpenOffice who are developing front end applications for managing Excel spreadsheets. I'd hate for Microsoft to get beat on their own home turf. If they are marketing Excel as a serious data analysis tool, then they should provide features to help insure the data at least meets a minimum level of conformity. What I'm proposing above is a simple solution to a common and potentially serious problem.

    Please review Data Validation in Excel. You can define types of data acceptable for input into the range.

    Doesn't require coding, just setting it up.

    Zip Code might be a slightly different twist.

    A Zip Code itself can be perfectly valid, but not be correct for the address.

    I find the link interesting.

    They say Excel error, but never mention what is the actual error.

    And Selective use of data - that is someone making up their own rules.

    Mistake? Or intentional?

    I have seen both, and seen it done in Excel as well as using filters on the data in formal tools.

    Which makes Context being front and center a real topic too.

    Thank you! That's what I'm talking about; although it doesn't exactly apply to columns as a whole but rather a range of cells (which could be the entire column if that's what the user selects). Not selecting the top row leaves room for column headers, but it would help if column headers were not treated as just text in a row, but seperate from the data portion of the sheet.

    Apply data validation to cells

    http://office.microsoft.com/en-us/excel-help/apply-data-validation-to-cells-HP010072600.aspx

    This story goes into a little more detail about what type of Excel mistake the economists made.

    FAQ: Reinhart, Rogoff, and the Excel Error That Changed History

    http://www.businessweek.com/articles/2013-04-18/faq-reinhart-rogoff-and-the-excel-error-that-changed-history

    Reinhart and Rogoff's Second Response To Critique Of Their Research

    http://www.huffingtonpost.com/mark-gongloff/reinhart-rogoff-research-response_b_3099185.html

    .. They admit they accidentally excluded five rows from an average in their Microsoft Excel spreadsheet, but not the other charges. Fixing the spreadsheet error would lift growth in those high-debt countries to about 0.2 percent annually ..

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell (1/24/2014)


    Here is an example of why it matters:

    Did Harvard Economists Make an Excel Error that Led to Economic Austerity?

    http://finance.yahoo.com/blogs/daily-ticker/did-harvard-economists-excel-error-lead-economic-austerity-185852805.html

    Here's a few reasons why I'm happy to specifically exclude that incident entirely as a reason for anything

    - these were folks who needed to be on a high enough level that they should have known better to do something as simple as checking their work,

    - research with important conclusions should always be independently verified, and in fact, I believe it was some of the independent verification that actually uncovered the errors and

    - despite the mistakes in their original work, the original economists are standing by their conclusions, so in that case, why did the spreadsheet even matter, and

    - its economics, which has different subjective schools of the "science" to such an extreme as to make political science look like basic arithmetic.

    Now I'm sure someone will make the argument that "this research was used as support for austerity" but I'm absolutely willing to bet that the austerity position was already held by those who simply picked up the research and ran with it in support of positions already held.

  • patrickmcginnis59 10839 (1/24/2014)


    Eric M Russell (1/24/2014)


    Here is an example of why it matters:

    Did Harvard Economists Make an Excel Error that Led to Economic Austerity?

    http://finance.yahoo.com/blogs/daily-ticker/did-harvard-economists-excel-error-lead-economic-austerity-185852805.html

    Here's a few reasons why I'm happy to specifically exclude that incident entirely as a reason for anything

    - these were folks who needed to be on a high enough level that they should have known better to do something as simple as checking their work,

    - research with important conclusions should always be independently verified, and in fact, I believe it was some of the independent verification that actually uncovered the errors and

    - despite the mistakes in their original work, the original economists are standing by their conclusions, so in that case, why did the spreadsheet even matter, and

    - its economics, which has different subjective schools of the "science" to such an extreme as to make political science look like basic arithmetic.

    Now I'm sure someone will make the argument that "this research was used as support for austerity" but I'm absolutely willing to bet that the austerity position was already held by those who simply picked up the research and ran with it in support of positions already held.

    Yes, often times politicians, activists, and social media selectively compile "facts" to support their narrative, rather than using narrative to present objective and independently verifiable facts.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Jim P. (1/24/2014)


    andrew gothard (1/24/2014)


    Do you not mean http://en.wikipedia.org/wiki/Jude_the_Apostle ?

    If you read the St Simeon link he is considered the patron saint of fools.

    Indeed.

    St Jude is the Patron Saint of Lost Causes though

    I'm a DBA.
    I'm not paid to solve problems. I'm paid to prevent them.

  • But users shouldn't necessarily have to choose between strongly typed schema versus spreadsheets. It's conceptually possible to overlay schema on a column/record/cell based spreadsheet.

    Good god man, it's hard enough to get the concept of "USE APPROPRIATE DATATYPES AND SANITISE INPUT FFS" through the skulls of a hell of a lot of "Developers" :doze: nowadays :w00t:. Good luck with getting end users to get into the concept

    I'm a DBA.
    I'm not paid to solve problems. I'm paid to prevent them.

Viewing 15 posts - 76 through 89 (of 89 total)

You must be logged in to reply to this topic. Login to reply