The Dangers of Algorithms

  • Wayne West (2/29/2016)


    suparSteve (2/29/2016)


    ...So at this point, it's effectively impossible to determine whether the algorithm made a mistake, because the model probably resists human comprehension. So the concept of examining the code makes less and less sense; the best you can do, perhaps, is run it on training sets and see if you get good results. However, if you are doing unsupervised training, that's not going to help so much.

    And this is why I'm not too keen on IBM's Dr. Watson being the sole source for diagnosing my medical problems! :ermm:

    I actually would be, not as a sole decision maker, but as an input. Plenty of times human doctors aren't really sure either. They often end up going with a generic response and don't have reasons other than "this worked for someone else."

  • Steve Jones - SSC Editor (2/29/2016)


    Wayne West (2/29/2016)


    suparSteve (2/29/2016)


    ...So at this point, it's effectively impossible to determine whether the algorithm made a mistake, because the model probably resists human comprehension. So the concept of examining the code makes less and less sense; the best you can do, perhaps, is run it on training sets and see if you get good results. However, if you are doing unsupervised training, that's not going to help so much.

    And this is why I'm not too keen on IBM's Dr. Watson being the sole source for diagnosing my medical problems! :ermm:

    I actually would be, not as a sole decision maker, but as an input. Plenty of times human doctors aren't really sure either. They often end up going with a generic response and don't have reasons other than "this worked for someone else."

    Medicine and its practice are an odd thing. After my second or third bout of pneumonia, my wife had correctly diagnosed my immune disorder via internet research (her father was a pathologist). My GP wouldn't order the blood test that would diagnose me, even though we would not have expected him to treat me. A lung doctor, when presented with before-and-after x-rays, plus pathology reports, plus blood work, said I didn't have pneumonia. Even the immunologist thought I was having severe bronchitis. Fortunately my wife browbeat him (as a prelude to actually beating him) into running the immunoglobulin test, and surprise! She was right. By the time I got the correct test run, I had had pneumonia five times.

    A lot of doctors hate the internet because people, tinged with some hypochondria, self-diagnose too much and don't like not getting treated. My wife and I are a bit paranoid about my health, but when your body doesn't produce immunoglobulin, you gotta be. My current immunologist (the previous one retired) appreciates how thoroughly I organize and maintain my medical information, part of my DBA background.

    Watson might have helped me get a quicker diagnosis. Common Variable Immunodeficiency (CVID) is the second most common cause of recurrent pneumonia; it's possible that a doctor would have weighed Watson's suggestion more heavily than the actual sick person's and ordered the test sooner, saving me a world of grief. But that was 2009; hopefully I won't have a need for Watson in the future.

    -----
    [font="Arial"]Knowledge is of two kinds. We know a subject ourselves or we know where we can find information upon it. --Samuel Johnson[/font]

Unfortunately there are too many reports of misdiagnoses due to the old-school "algorithms" doctors learn in school. The AMA and CDC seem to have egos that prevent them from even questioning their own algos, especially when faced with public scrutiny such as mercury in vaccines or fillings. The story of Lyme disease is a good example of how it takes major fights with the big organizations to get them to see the error of their ways.

    Of course at the lowly DBA or application level, simply forcing users to make decisions on categories like "sex" may be introducing small errors. About one in a thousand people is born indeterminate. If we multiply this slight error rate by the number of types, classes, or checkboxes filled in per customer, we end up with quite a high degree of unreliability, like the O-ring on the shuttle (there's a rough sketch of the arithmetic at the end of this post). I've always felt belittled by being reduced to salary ranges, marital status, age range, etc.

    In the case of revealing algorithms, it just leads to fraud. People try to game the Google ranking all the time. In the case of horse racing, revealing the tests used for drug detection would likewise lead to chemistry hacking.
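
    Just to put a rough number on that compounding effect mentioned above, here is a quick back-of-the-envelope sketch in Python; the field names and per-field error rates are invented purely for illustration:

    [code="python"]
# Minimal sketch: how small per-field error rates compound across a record.
# The field names and per-field error rates below are made up for illustration.

field_error_rates = {
    "sex": 0.001,            # ~1 in 1000 people don't fit a binary choice
    "marital_status": 0.002,  # stale or ambiguous answers
    "age_range": 0.005,       # typos, boundary cases
    "salary_range": 0.01,     # self-reported, often rounded or outdated
}

# Probability that at least one field in a record is wrong,
# assuming the errors are independent: 1 - product(1 - p_i)
p_all_correct = 1.0
for p_error in field_error_rates.values():
    p_all_correct *= (1.0 - p_error)

p_some_error = 1.0 - p_all_correct
print(f"Chance a single record has at least one bad field: {p_some_error:.2%}")

# Scale up: expected number of flawed records in a million-customer table.
customers = 1_000_000
print(f"Expected flawed records out of {customers:,}: {customers * p_some_error:,.0f}")
    [/code]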

  • If I bought Eddie Van Halen's guitar it wouldn't make me a great guitar player.

    A great programmer can come up with something amazing on surprisingly limited kit, whereas a poor programmer can create memory leaks out of nothing!

    Businesses seem incapable of learning that it is the craftsman that makes the difference, not the tool. How many times have you seen the business buy that latest shiny toy only to find it disappoints? All the gear and no idea!

    Take a recommendation algorithm and apply it to two different shops. It might predict my behaviour in one shop well but poorly in the other. A factor that is difficult to measure is what my perception of a shop is. Shame it's such an important factor. (A toy sketch of the two-shops point follows at the end of this post.)
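
    As a toy illustration of that point, here is a small Python sketch in which the same naive "recommend the most popular item" rule scores well against one shop's synthetic purchase data and poorly against another's; the shops, items, and numbers are all made up:

    [code="python"]
import random

random.seed(42)

def most_popular_recommender(purchase_history):
    """Recommend whichever item appears most often in the shop's history."""
    counts = {}
    for item in purchase_history:
        counts[item] = counts.get(item, 0) + 1
    return max(counts, key=counts.get)

def hit_rate(recommend, history, next_purchases):
    """Fraction of customers whose next purchase matched the recommendation."""
    rec = recommend(history)
    hits = sum(1 for item in next_purchases if item == rec)
    return hits / len(next_purchases)

# Shop A: customers mostly buy the same staple item, so popularity works well.
shop_a_history = ["bread"] * 80 + ["milk"] * 15 + ["eggs"] * 5
shop_a_next = random.choices(["bread", "milk", "eggs"], weights=[80, 15, 5], k=200)

# Shop B: tastes are spread thin, so the same rule predicts poorly.
shop_b_items = ["guitar", "amp", "pedal", "strings", "picks"]
shop_b_history = random.choices(shop_b_items, k=100)
shop_b_next = random.choices(shop_b_items, k=200)

print(f"Shop A hit rate: {hit_rate(most_popular_recommender, shop_a_history, shop_a_next):.0%}")
print(f"Shop B hit rate: {hit_rate(most_popular_recommender, shop_b_history, shop_b_next):.0%}")
    [/code]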

  • Steve Jones - SSC Editor (2/29/2016)


    suparSteve (2/29/2016)


    ...

    So at this point, it's effectively impossible to determine whether the algorithm made a mistake, because the model probably resists human comprehension. So the concept of examining the code makes less and less sense; the best you can do, perhaps, is run it on training sets and see if you get good results. However, if you are doing unsupervised training, that's not going to help so much.

    By the way, so far I've been assuming there's one model. More likely it's a model that's a combination of models. And you thought that view defined inside a view defined inside a view defined inside another view was bad...

    There are a ton of challenges dealing with this stuff, and there are a lot of assumptions and old ways of thinking that might need to be cast aside or at least seriously reconsidered to answer these questions.

    Certainly understanding the model is hard, but it can be done. Not necessarily by us as individuals, though I would hope the results are reproducible so that we can determine how things behave if there is cause for doubt.

    This is where transparency or openness is required for organizations when things are called into question.

    I am not entirely certain that we can understand the application of a neural network algorithm. We can certainly understand the model, but self-learning causes shifts in weighting that we do not necessarily fully understand. Training a neural network on numerous examples can produce outcomes whose reasoning we struggle to follow, because it is not logic we can trace (see the toy sketch at the end of this post).

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!
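
    As a toy illustration of why trained weights resist inspection, here is a minimal Python/NumPy sketch; it is an invented example, not how Watson or any production system works. A tiny network learns XOR, yet nothing in its weight matrices reads like a rule a human could follow:

    [code="python"]
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

hidden = 8
W1 = rng.normal(size=(2, hidden)); b1 = np.zeros((1, hidden))
W2 = rng.normal(size=(hidden, 1)); b2 = np.zeros((1, 1))

# Plain full-batch gradient descent; with this seed it converges, though
# toy nets like this can occasionally get stuck, which is rather the point.
for _ in range(10000):
    h = sigmoid(X @ W1 + b1)                 # hidden activations
    out = sigmoid(h @ W2 + b2)               # network output
    d_out = (out - y) * out * (1 - out)      # output-layer gradient
    d_h = (d_out @ W2.T) * h * (1 - h)       # hidden-layer gradient
    W2 -= h.T @ d_out;  b2 -= d_out.sum(axis=0, keepdims=True)
    W1 -= X.T @ d_h;    b1 -= d_h.sum(axis=0, keepdims=True)

pred = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
print("Inputs -> predictions:", np.round(pred.ravel(), 2))
print("Hidden-layer weights:\n", np.round(W1, 2))
print("Output weights:", np.round(W2.ravel(), 2))
# The predictions match XOR, but nothing in these numbers says "XOR".
    [/code]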

The best IDE tools separate the logical design from the physical design while keeping them integrated into one seamless end-to-end process. For example, a business analyst can use the tool to create a UML diagram, logical data model, or abstract outline of a form or report during the design phase, and then in the development phase a programmer or engineer uses that same document and tool to generate actual code or DDL (a rough sketch of the idea follows at the end of this post). There are a handful of good tools that work that way, more or less.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • The fishbowl we swim in is relative truth - not absolute truth. That is what seems to be the thrust of this editorial.

    Neural network algorithms don't change, but the data path through them gravitates a certain way depending on the data that runs through them. And the path is ever changing, so I don't know if one can even go back to see how choices were made...just as we never remember an early life event as it really happened.

    Genetic algorithms do change by definition...if you allow them to continue to evolve. And the "logic" of their choices exceeds human ability within minutes of running. See my article here from a few years ago. I suppose it would be possible to save the DNA at the time it was used for each process, but it would be crazy to do that (a toy sketch at the end of this post shows the idea). Don't we allow people to learn from their mistakes and hopefully improve over time? Maybe we'll need to do that with AI also.

    Teens are being charged for being biased and using stereotypes, but that is their current level of brain development. They will grow through it just like most of us have done. It shouldn't be a crime.
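
    For what it's worth, here is a toy genetic algorithm sketch in Python (an invented OneMax example, unrelated to any production system) showing what saving the "DNA" of each generation for later audit might look like:

    [code="python"]
import random

random.seed(1)

# Toy GA: evolve a bit string toward all 1s, snapshotting each generation's
# genomes into an audit log so choices could, in principle, be revisited.
GENOME_LEN, POP_SIZE, GENERATIONS, MUTATION_RATE = 20, 30, 40, 0.02

def fitness(genome):
    return sum(genome)                      # count of 1-bits

def mutate(genome):
    return [bit ^ 1 if random.random() < MUTATION_RATE else bit for bit in genome]

def crossover(a, b):
    cut = random.randint(1, GENOME_LEN - 1)
    return a[:cut] + b[cut:]

population = [[random.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(POP_SIZE)]
audit_log = []                              # the "DNA" snapshot per generation

for gen in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    audit_log.append([g[:] for g in population])    # keep a copy for audit
    parents = population[: POP_SIZE // 2]           # keep the fitter half
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP_SIZE - len(parents))]
    population = parents + children

best = max(population, key=fitness)
print(f"Best genome after {GENERATIONS} generations: {fitness(best)}/{GENOME_LEN} bits set")
print(f"Audit log holds {len(audit_log)} generation snapshots")
    [/code]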

An algorithm is nothing but math. How can it be dangerous? If the math is correct (how about E=mc^2?), it is what it is. The fact that this result from special relativity can lead to someone building an atomic bomb doesn't make the math "dangerous"; the atom bomb builder is the dangerous one. This is not a new problem! Newton's theory of gravity led to the ability to accurately compute artillery trajectories. That doesn't make his theory (or Newton) dangerous either.

    I think it is important to note that math cannot be patented or copyrighted. Granted, a paper presenting a mathematical development can be copyrighted (but not patented), but the math itself cannot. Math is considered something that is discovered, not invented. That is, no one invents an algorithm (which is just math); the algorithm is discovered.

    From my perspective all algorithms, once proven (mathematically as well as empirically) should be published as soon as practicable.

    Gerald Britton, Pluralsight courses

  • David.Poole (2/29/2016)


    If I bought Eddie Van Halen's guitar it wouldn't make me a great guitar player.

    Are you sure? Have you tried?

  • My point, perhaps poorly made, was that algorithms can be well or poorly implemented. They can be beneficial or harmful, or even malicious, to the use case. They can be appropriate or illegal, but they should be understood by someone, and more importantly, able to be examined or disclosed.

  • Steve Jones - SSC Editor (3/1/2016)


    My point, perhaps poorly made, was that algorithms can be well or poorly implemented. They can be beneficial or harmful, or even malicious, to the use case. They can be appropriate or illegal, but they should be understood by someone, and more importantly, able to be examined or disclosed.

    Well said, but maybe a qualification would improve the statement? Able to be examined or disclosed under the right conditions; and that is where the difficulty ensues, because getting people to agree on what the right conditions are is hard.

    Dave

  • Wayne West (2/29/2016)


    There were several problems with electronic voting machines. The problem with trusting voting machine algorithms was that the companies would not allow the code to be audited by independent third parties. When the machines were eventually tested, they were found to be hideously insecure and inconsistent, not to mention that some lost data when they crashed or lost power. A huge number of the electronic voting machines produced following the 2000 elections have been scrapped at significant cost and waste.

    Electronic voting machines could have been created by open, independent, organizations. But that wouldn't favor whatever party was in power, and wouldn't give any campaign donors any nice contracts. We can't have that, it'd be flat-out un-American!

    But the biggest trust problem was when the CEO of the biggest maker publicly said that they were "committed to helping Ohio deliver its electoral votes to the President." Who cares what the people vote!

    While the act of counting is fundamentally basic, the act of creating a ballot to favor your party is an art form right up there with redistricting.

    I think I need a stiff drink and a peer group.

    This is the first time I've ever seen it laid out why people don't trust electronic voting machines. Thanks for posting this!

    Kindest Regards, Rod
    Connect with me on LinkedIn.

  • To change the outcome of an election, you may only need to tweak the voting machines in Florida and Ohio. The Pentagon has ties to the voting machines. The Bush Dynasty has many family members in the Pentagon. Janet Reno, I believe, was the judge that chose to cover up the voting fraud...as a way to not destroy the faith of the people. Ask Al Gore if there is something curious about the voting machines.

    Why the government resists an audit trail for voting reveals it all to me. It is not how many votes that matters; it is who counts the votes.

  • So many companies are going to fight tooth and claw to prevent their algorithms from being disclosed as they are trade secrets. And our lives are affected by this. I'm personally hit -- Experian conflated three traffic accidents that my dad had -- 500 miles from where I live in a car that I don't own -- with my driving record and now I can't change my insurance. It's not a huge deal since my insurance knows that it's not me who had the accidents and my rate didn't go up, but what else might this affect? And how did I find out? By trying to get insurance quotes and finding out that I couldn't. It's not like Experian was transparent and told me about what information they have on me.

    Would it be good if algorithms were public knowledge? Well, how many people use Linux because they can audit the code and make modifications? Not a lot. It takes experts to analyze these things and run them through simulations to see if predictable outputs can be generated.

    And then you have Expert Wars. My expert looks at your algorithm and says 'Red'; your expert says 'Blue'. Who is correct? Who is going to pay to bring in a third expert, hoping they verify your result and not your opponent's, and hoping they don't say 'Green'?

    I think it's a big mess and that at a societal level we can't do much to clean it up. Not unlike politics, there's too much money at stake. I can't begin to speculate how many databases, or whose, have been accumulating information on me, so how can I ask them what info they have on me and how they gather it, much less 'can I see your algorithm?'

    -----
    [font="Arial"]Knowledge is of two kinds. We know a subject ourselves or we know where we can find information upon it. --Samuel Johnson[/font]

  • Wayne West (3/1/2016)


    ...how many people use Linux because they can audit the code and make modifications? Not a lot...

    More choose Linux because they could audit the code and make modifications if they chose to do so than choose it because they intend to.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

