The Dangers of Algorithms

  • Wayne West (3/1/2016)


    So many companies are going to fight tooth and claw to prevent their algorithms from being disclosed, as they are trade secrets. And our lives are affected by this. I've personally been hit: Experian conflated three traffic accidents that my dad had -- 500 miles from where I live, in a car that I don't own -- with my driving record, and now I can't change my insurance. It's not a huge deal, since my insurer knows that it's not me who had the accidents and my rate didn't go up, but what else might this affect? And how did I find out? By trying to get insurance quotes and finding that I couldn't. It's not as though Experian was transparent and told me what information they have on me.

    Would it be good if algorithms were public knowledge? Well, how many people use Linux because they can audit the code and make modifications? Not a lot. It takes experts to analyze these things and run them through simulations to see if predictable outputs can be generated.
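    To make the 'simulations' idea concrete, here is a minimal sketch of what black-box probing can look like. The scoring function is entirely invented for illustration; the point is only the technique: hold a record constant, vary one field at a time, and watch whether the output moves.

```python
# Hypothetical black-box audit: probe an opaque scoring function by
# varying one input at a time and watching how the output moves.
# The scorer below is an invented stand-in, not any real company's model.

def opaque_score(record):
    # Pretend this is the trade-secret model we can call but not read.
    score = 600
    score += 2 * record["years_no_claims"]
    score -= 40 * record["accidents"]  # accidents hurt the score a lot
    score -= 15 if record["zip"] in {"10001", "60601"} else 0  # a proxy?
    return score

def probe(base, field, values):
    """Re-score the same record with one field swapped out."""
    return {v: opaque_score(dict(base, **{field: v})) for v in values}

base = {"years_no_claims": 10, "accidents": 0, "zip": "80202"}

# If changing only the zip code moves the score, the model is using
# location as an input -- possibly as a proxy for something else.
print(probe(base, "zip", ["80202", "10001", "60601"]))
print(probe(base, "accidents", [0, 1, 3]))  # and accidents, as expected
```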

    And then you have Expert Wars. My expert looks at your algorithm and says 'Red'; your expert says 'Blue'. Who is correct? Who is going to pay to bring in a third expert, hoping they verify your result and not your opponent's, and hoping that they don't say 'Green'?

    I think it's a big mess, and that at a societal level we can't do much to clean it up. Not unlike politics, there's too much money at stake. I can't begin to speculate how many databases, or whose, have been accumulating information on me, so how can I ask them what info they have on me and how they gather it, much less 'can I see your algorithm?'

    I don't want things public. That creates issues. But they need to be able to be audited, independently.

    Can two experts disagree? Sure. We see this all the time in all sorts of engineering disputes, medical issues, financial ones, etc. The point is that there can be some sort of group to weigh both sides. Jury, arbitrator, judge, whatever.

    I'm not asking for things to be 100% correct, but rather that they can be judged.

  • Gary Varga (3/1/2016)


    Wayne West (3/1/2016)


    ...how many people use Linux because they can audit the code and make modifications? Not a lot...

    More people choose Linux because they could audit the code and make modifications if they chose to than choose it because they actually intend to.

    Agreed, and I understand that angle. It would be interesting to know how many people modify the Linux source, for whatever reason, compared to just using it. I'm talking about modifying actual code that compiles down to binaries, not just shell scripting.

    -----
    Knowledge is of two kinds. We know a subject ourselves or we know where we can find information upon it. --Samuel Johnson

  • Wayne West (3/1/2016)


    Gary Varga (3/1/2016)


    Wayne West (3/1/2016)


    ...how many people use Linux because they can audit the code and make modifications? Not a lot...

    More people choose Linux because they could audit the code and make modifications if they chose to than choose it because they actually intend to.

    Agreed, and I understand that angle. It would be interesting to know how many people modify the Linux source, for whatever reason, compared to just using it. I'm talking about modifying actual code that compiles down to binaries, not just shell scripting.

    The only person I have ever been sure actually did was someone involved in the development of SMB. Everyone else I have spoken to says, at most, that they CAN. For some it is an interesting prospect, for others a bragging right, but for most it is irrelevant.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • Almost no one modifies the code. I know a few companies that did it with PostgreSQL, but they regretted it. They created a fork, and keeping up with patches and changes from the mainline became an issue.

    The case with FOSS is often that you can pay someone to look at the code and figure out why things don't work, rather than making changes yourself. Larger companies will make changes, but they contribute them back to the mainline and hope to get them into the next release.

  • I don't want things public. That creates issues. But they need to be able to be audited, independently.


    I want all algorithms public. Algorithms published in peer-reviewed journals have a greater chance of being provably correct. Implementations? Not so much.

    Gerald Britton, Pluralsight courses

  • Steve Jones - SSC Editor (3/1/2016)


    David.Poole (2/29/2016)


    If I bought Eddie Van Halen's guitar it wouldn't make me a great guitar player.

    Are you sure? Have you tried?

    I haven't got the passion and dedication, and my son has purloined my guitar.

    I do have the passion and dedication for the piano.

  • djackson 22568 (2/29/2016)


    Kyrilluk (2/29/2016)


    The real issue, from the perspective of the left or extreme-left London School of Economics, with using algorithms is that they tend to confirm prejudices instead of dispelling them. They have a problem with reality. Now, because they can't argue with an algorithm, they are pushing for legislation that would force a firm to stop using an algorithm that doesn't adhere to left-leaning policies or ideology.

    It reminds me of the fact that in the US it is not permitted to use IQ tests to assess whether a person will be a successful hire. This is banned not because IQ tests are bad predictors of work-related ability, but because any intelligence test will discriminate against certain minorities (not all of them: Asians do pretty well on these tests). Companies such as Microsoft or Google get around the ban by re-branding their tests, but the result is the same: a highly proficient workforce that lacks a certain diversity (remember that Asians, who are part of the "diversity" in the US, are over-represented, while whites, the majority, are under-represented at companies such as Google or Facebook).

    What bothers the LSE is that even if you don't feed in a person's ethnicity, algorithms are going to use the next best things to predict that person's behavior. Since people's behavior is conditioned by ethnicity -- whether it is the post code, the type of trainers someone has bought, the kind of food someone consumes, or the amount of jail time someone has served -- the end result will always have a strong "race" component. That's reality for you.

    Because, in some instances, it is impossible to know what precise algorithm has been used to classify this or that behavior, suing the bastards is pretty much impossible. So that's why the LSE and probably other lobbies are going to push for a posteriori control of the use of an algorithm. The problem being that even this is not very practical, because models can change without prior warning.
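    Setting the politics aside, the narrow technical claim above -- that a model can reconstruct a withheld attribute from correlated proxies -- is easy to demonstrate on synthetic data. A minimal sketch, with all numbers and correlations invented for illustration:

```python
# Synthetic demo of the proxy effect: the model never sees "group",
# but a correlated feature (postcode) lets it reconstruct it anyway.
# All data here is randomly generated for illustration only.
import random

random.seed(42)

rows = []
for _ in range(10_000):
    group = random.choice(["A", "B"])
    # Postcode correlates with group (think residential clustering).
    postcode = random.choices(["north", "south"],
                              weights=[8, 2] if group == "A" else [2, 8])[0]
    rows.append({"group": group, "postcode": postcode})

# A "model" that only ever looks at postcode...
def predict_group(row):
    return "A" if row["postcode"] == "north" else "B"

# ...still recovers group membership far better than chance:
correct = sum(predict_group(r) == r["group"] for r in rows)
print(f"accuracy from postcode alone: {correct / len(rows):.0%}")  # ~80%
```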

    Good to see someone else willing to speak out about this.

    Sad to see support for Kyrilluk's distortion of reality.

    Tom

  • Good article.

    I don't agree with you that IPR should be a reason for not publishing algorithms and code. I know it wasn't an editorial about IPR, but I think your statement in the article is utterly wrong. Either there's something to patent (in the United States, at least; in the EU not in theory, but apparently maybe in practice) or there's nothing wrong with the protection provided by copyright. And given the way IP protection is being abused by all the big tech companies, I wouldn't be too unhappy if publishing the algorithms caused some failures of IP protection. And given the way the European Patent Office is currently being run, and its incredible legal status, I think it would be a good idea to scrap the EPO (along with the US Patent Office, which has needed scrapping for decades) and start again with new organisations structured and regulated so as to give them some incentive to stop accepting utter bullshit patents with no inventive step whatever, and to put a reasonable limit on the life of patents. And maybe put some reasonable limit on copyright duration instead of upping it every time a big player indicates that campaign donations might depend on doing that.

    Tom

  • TomThomson (3/1/2016)


    djackson 22568 (2/29/2016)


    Kyrilluk (2/29/2016)


    The real issue, from the perspective of the left or extreme-left London School of Economics, with using algorithms is that they tend to confirm prejudices instead of dispelling them. They have a problem with reality. Now, because they can't argue with an algorithm, they are pushing for legislation that would force a firm to stop using an algorithm that doesn't adhere to left-leaning policies or ideology.

    It reminds me of the fact that in the US it is not permitted to use IQ tests to assess whether a person will be a successful hire. This is banned not because IQ tests are bad predictors of work-related ability, but because any intelligence test will discriminate against certain minorities (not all of them: Asians do pretty well on these tests). Companies such as Microsoft or Google get around the ban by re-branding their tests, but the result is the same: a highly proficient workforce that lacks a certain diversity (remember that Asians, who are part of the "diversity" in the US, are over-represented, while whites, the majority, are under-represented at companies such as Google or Facebook).

    What bothers the LSE is that even if you don't feed in a person's ethnicity, algorithms are going to use the next best things to predict that person's behavior. Since people's behavior is conditioned by ethnicity -- whether it is the post code, the type of trainers someone has bought, the kind of food someone consumes, or the amount of jail time someone has served -- the end result will always have a strong "race" component. That's reality for you.

    Because, in some instances, it is impossible to know what precise algorithm has been used to classify this or that behavior, suing the bastards is pretty much impossible. So that's why the LSE and probably other lobbies are going to push for a posteriori control of the use of an algorithm. The problem being that even this is not very practical, because models can change without prior warning.

    Good to see someone else willing to speak out about this.

    Sad to see support for Kyrilluk's distortion of reality.

    Sad that you assumed things I didn't say. I am glad others are willing to speak out about people on the left trying to force-feed their opinions as science, despite evidence that they are wrong. His point about IQ tests is particularly noteworthy: a test is not discriminatory! I did not call out any particular opinion until now, though.

    Dave

  • djackson 22568 (3/1/2016)


    Steve Jones - SSC Editor (3/1/2016)


    My point, perhaps poorly made, was that algorithms can be well or poorly implemented. They can be beneficial or harmful, or even malicious, to the use case. They can be appropriate or illegal, but they should be understood by someone, and more importantly, able to be examined or disclosed.

    Well said, but maybe a qualification would improve the statement? Able to be examined or disclosed under the right conditions -- which is where the difficulty ensues, as getting people to agree about what the right conditions are is hard.

    The issue is closing your eyes and ears and trusting the decision-making to machines which, although good, often cannot pick out patterns as efficiently as we do. They understand certain patterns very well, but some they never really master. Without a brain to balance out multiple contravening pattern-matching algorithms (which we seem to be tooled with), algorithms are easy to fool once you understand them.

    The real flaw is that there is nothing intrinsically good or bad about the algorithm; it's that we tend to impute too much onto it. We often misapply it, or presume that it will continue to get balanced data that will not cause it to drift one way or another. Just like the report that gets used to justify something it wasn't built to even discuss, we often build algorithms and then use them to predict or monitor behavior they don't fully understand or capture.
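    That drift point is worth making concrete. A hypothetical sketch (all numbers invented): a cutoff calibrated on one data distribution keeps being applied, unchanged, after the underlying behavior shifts, and nothing in the algorithm itself complains.

```python
# Hypothetical illustration of silent drift: a cutoff tuned on one
# distribution keeps being applied after the world quietly moves.
import random

random.seed(0)

def legit_scores(mean):
    """Scores of legitimate activity under a given behavior regime."""
    return [random.gauss(mean, 10) for _ in range(5000)]

def false_alarm_rate(scores, cutoff):
    return sum(s > cutoff for s in scores) / len(scores)

# Calibrate once: with scores centered on 50, a cutoff of 65 flags
# only the high tail (~7% of legitimate activity).
cutoff = 65.0
print(f"then: {false_alarm_rate(legit_scores(50), cutoff):.0%} false alarms")

# Behavior shifts upward and nobody re-tunes; the same cutoff now
# flags roughly a third of perfectly normal activity.
print(f"now:  {false_alarm_rate(legit_scores(60), cutoff):.0%} false alarms")
```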

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Matt Miller (#4) (3/1/2016)


    ...Without a brain to balance out multiple contravening pattern-matching algorithms (which we seem to be tooled with), algorithms are easy to fool once you understand them...

    Spot on.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • Humans are good at spotting patterns even when there isn't one.

    An algorithm will be applied consistently and without bias or prejudice.

    Can you fool them if you understand them? Google Derren Brown.

    Remember that humans are susceptible to the HiPPO principle: the Highest Paid Person's Opinion.

  • All good points David.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • g.britton (3/1/2016)


    I don't want things public. That creates issues. But they need to be able to be audited, independently.

    .

    I want all algorithms public. Algorithms published in peer-reviewed journals have a greater chance of being provably correct. Implementations? Not so much.

    I work a lot with home-brewed algorithms. It's hard to get any organization to release their secret sauce. Instead, most companies that offer them as a service, and still need to validate the results for ROI, take the time to explain what is going on, with research, test results, and white papers to back it up. But no one is going to release their algorithm to the public, much like the company you work for is not going to release its source code to the public unless you work entirely in open source.

    I think a good example of this most hidden and yet teased-about algorithm would be Google's ranking algorithm. Why won't they release it? Because once you know how the algorithm works, people will game the system. Sites like SQL Server Central could be greatly impacted by knowledge of that algorithm, because others would find ways to game it. So what Google does is give us hints on how to improve ourselves in order to take advantage of the algorithm and get better site rankings when someone searches for our content. They use the secret to do good rather than to promote harm.

    From my end, I've screwed up a few algorithms by choosing the wrong variables due to poor planning and poor execution on my part.

  • I don't want things public. That creates issues. But they need to be able to be audited, independently.

    Can two experts disagree? Sure. We see this all the time in all sorts of engineering disputes, medical issues, financial ones, etc. The point is that there can be some sort of group to weigh both sides. Jury, arbitrator, judge, whatever.

    I'm not asking for things to be 100% correct, but rather that they can be judged.

    If we are still talking about data-mining algorithms, the issue is that some algorithms -- such as neural networks -- cannot be audited. This is why they are called "black box" algorithms. When I say that they cannot be audited, I mean that although it is possible in theory, in practice you would need your expert to peruse your models for months before being able to work out why one particular decision was taken (for example, why your loan was refused). And by then, the algorithm would have produced a different model anyway.
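    A toy version of the work such an expert would face: the only practical handle on a single black-box decision is to nudge each input and see what moves the output, which is roughly the perturbation idea that tools like LIME automate. The model below is a made-up stand-in, not any real lender's system.

```python
# Toy local explanation of one black-box decision: nudge each input
# and record how much the output moves. A crude, hand-rolled version
# of the perturbation idea behind tools like LIME.

def black_box(applicant):
    # Invented stand-in for an opaque model we can call but not read.
    z = (0.04 * applicant["income"]
         - 0.8 * applicant["debts"]
         + 0.5 * applicant["years_employed"]
         - 3.0)
    return 1 / (1 + 2.718281828 ** -z)  # approval probability

def explain_one_decision(applicant, step=1.0):
    """Per-feature sensitivity of the score for THIS applicant only."""
    base = black_box(applicant)
    effects = {}
    for feature in applicant:
        nudged = dict(applicant, **{feature: applicant[feature] + step})
        effects[feature] = black_box(nudged) - base
    return base, effects

applicant = {"income": 40.0, "debts": 3.0, "years_employed": 2.0}
score, effects = explain_one_decision(applicant)
print(f"approval probability: {score:.2f}")
for feature, delta in sorted(effects.items(), key=lambda kv: kv[1]):
    print(f"  +1 {feature}: {delta:+.3f}")

# Even this explains only one decision, locally; it says nothing
# about the model's behavior elsewhere, which is the point above.
```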

    Trying to prove that an algorithm is biased against you, or made a deliberate mistake (or that you are being discriminated against), is too costly. As I stated earlier on, the only way is to judge the result (you didn't get the loan, so the bank manager or the algorithm must therefore be racist/sexist/islamophobic/homophobic/minority-phobic/etc.) and not the algorithm.
