Machine Learning and Human Self-awareness

All the talk around machine learning prompts us to reflect on how humans learn. What are the parallels between humans and machines? What can machine learning teach us about our experiences and the actions we take based on those experiences?

ML is a way to provide meaningful experiences to machines. 

We convey information to silicon-based entities in a language they understand: “When this happens, this other thing tends to happen.” Or, getting slightly more complicated, “When these four things happen, with some of those things being more significant than others, this other thing has a very big chance of happening.”
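To make that second statement a bit more concrete (this sketch is mine, not from the original post), here is roughly what “several weighted inputs producing a likely outcome” looks like in code. The feature names and weights are entirely hypothetical:

```javascript
// A minimal sketch of "when these four things happen, with some more
// significant than others, this other thing is very likely to happen."
// Feature names and weights are hypothetical, for illustration only.
const weights = { cloudCover: 1.8, humidity: 2.3, windSpeed: -0.4, pressureDrop: 2.9 };
const bias = -3.0;

// Probability that "this other thing" (say, rain) happens, given observed inputs.
function predict(features) {
  let score = bias;
  for (const [name, weight] of Object.entries(weights)) {
    score += weight * (features[name] ?? 0);
  }
  return 1 / (1 + Math.exp(-score)); // logistic function squashes the score into 0..1
}

console.log(predict({ cloudCover: 1, humidity: 1, windSpeed: 0.2, pressureDrop: 1 }));
// -> roughly 0.98, i.e. "a very big chance of happening"
```

Training is essentially the process of discovering weights like these from many example outcomes, rather than writing them in by hand.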

What makes machine learning different from run-of-the-mill statistics is that we tend to care less about the process or even the veracity of the data. The outcome is all that matters. If a machine is able to experience enough scenarios and outcomes, there is a fairly good chance it can provide us with a prediction. 

If machines learn by experiencing data, there is theoretically no limit to what they can learn. Data is the limit. A machine needs enough of the right kind of data for its predictions or insights to be meaningful. 

Humans learn through experience as well, but the sheer number of data points processed through our five senses is astronomical. Think of going for a walk. Every forward leg movement is a light-speed cycle of inputs and resulting outputs. Not only are we learning as we walk, but we’re also drawing on years of walking/learning experience. We have multiple models running at once.

There isn’t a single action humans take that isn’t informed by a nearly infinite number of data points. Human decisions are the result of a form of ‘supervised learning’. We act, experience, and choose to act again based on an aggregation of results or outcomes. And adding to the parallelism, we’re influenced by external models (other humans).

What are the experiences and training we’re providing each other? How does abuse shape a person’s expectations of outcomes? How does this affect the actions they take in the future? How does poverty impact the ‘supervised learning’ that humans experience? When a person lands in jail, how did they get there? What models is society using to put them there? When someone does something to contribute positively to society, how do we create responses that affirm these actions and stimulate more of them?

The more we explore machine learning, the more we’ll learn about ourselves. My hope is that this will provide us with a level of enlightenment and self-awareness that we’ve not seen before.

Postman API Learning, Testing, and Development

I’m pretty late to the API game. Recently I was on a call with a handful of security engineers, and they explained that they couldn’t afford to have their people staring at console screens any more. Instead, they rely almost entirely on APIs to automate and streamline their work. I’ve been hearing about API development forever, but I’d never gotten past the first hurdle: how to start. My answer to this is Postman.

Once you have an API you want to consume, you can start making ‘POST’ and ‘GET’ requests right away and see results immediately. One critical tipping point for me was watching a number of the introductory videos that Postman provides. For example, I didn’t understand what the ‘Tests’ section was for. The videos demonstrated that this is where you write JavaScript to traverse the JSON returned by your requests.
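For a sense of what those test scripts look like, here is a minimal example of the kind of JavaScript you can drop into the ‘Tests’ tab. The `pm` object is Postman’s built-in scripting API; the response fields I check (`results`, `id`) are hypothetical placeholders for whatever your API actually returns.

```javascript
// Runs automatically after the request completes.
// 'pm' is Postman's built-in scripting object.

pm.test("Status code is 200", function () {
  pm.response.to.have.status(200);
});

pm.test("Response contains the fields we expect", function () {
  const body = pm.response.json(); // parse the JSON response body

  // 'results' and 'id' are placeholder field names -- substitute the
  // structure your API actually returns.
  pm.expect(body.results).to.be.an("array");
  pm.expect(body.results[0]).to.have.property("id");
});
```

Postman runs these scripts after each request, so as you build up a collection you effectively get a small regression suite alongside your exploration.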

Currently, I’m only using a free account. I’m in learning mode, but as I move toward doing more work with APIs, I’ll absolutely be using Postman to test and verify my efforts. It’s also a great introduction to the security advantages and disadvantages of using APIs.

If you have a desire to dig into APIs and see what they can do to add value to your work, try Postman. And don’t forget to check out a few of their tutorial videos.

Health Care Pricing: Can big data help us here?

This morning I read an article in the January 12, 2019 edition of The Economist titled “Shopping for a Caesarean”. The article summarizes the challenges we face in the US around pricing for medical procedures. The true cost of medical procedures is lost in reams of arbitrary pricing algorithms.

In an era of “big data”, convoluted pricing presents a great irony. We have data that corresponds to nearly every other facet of our lives. This data helps businesses predict consumer behavior in order to market the right product to consumers at the right time.

In the health care industry, hospitals don’t have to predict consumer needs. Rather, consumers will purchase a procedure when they are sick and/or under “duress” (the word used in the Economist article). They aren’t likely to shop around. This “duress” allows hospitals to use creative pricing, make deals with insurers, and do all sorts of tricks that conceal the true cost of healthcare.

The Economist article argues that price transparency is the first step, but that it won’t solve the problem because of the “duress” faced by those in need of care. What is needed is a big-picture look at pricing that all of us can examine when we are not under duress. That way we can identify exactly who is benefiting from these gross inefficiencies. We need “big data” for the masses. We need “big data” that improves the standard of living for average folks, just as we have “big data” that helps businesses market products. However, as long as the medical industry profits greatly from hidden pricing algorithms, it has little incentive to share its secrets and drive more efficiency into the marketplace.

Originally, this lack of transparency was probably not intentional, but now that it generates so much profit for the healthcare industry there is very little incentive to do anything about it. We need more than transparency around pricing for each procedure; we need “big data” algorithms that will allow us to untangle our current pricing mess.

Amazon Athena – What?

If you’re like many IT professionals who’ve had anything to do with large amounts of data, you’ve become immune to the phrase ‘big data’, mostly because the meaning behind it can vary so wildly.

Processing ‘big data’ can seem out of reach for many organizations, either because of the infrastructure costs required to establish a foothold or because of a lack of organizational expertise. And since the meaning of ‘big data’ varies so much, you may find yourself doing ‘big data’ work and asking, “Is this big data?” Or an observer may suggest that something is ‘big data’ when you know full well that it isn’t.

With my own background in data, I’m ever curious about what’s out there that can make the barrier to entry for ‘big data’ seem less insurmountable. I’m also interested in the security considerations around these solutions.

In the last week or so, I’ve gotten more familiar with AWS S3 buckets and a query service called Amazon Athena. Here’s the truly amazing thing: you can simply drop files in an S3 bucket and query them straight from Amazon Athena. (There are just a couple of steps to go through, but they are mostly trivial.) And for the most part, there’s not much of a limit on how much data you can query and analyze. You can scan 1 TB of data for about $5. What? That’s right. And you didn’t have to set up servers, database platforms, or any of that. I’ll be exploring Amazon Athena more over the coming weeks. If you have an interest in this sort of thing, I suggest you do the same.
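To give a flavor of what “a couple of steps” means in practice, here is a rough sketch using the AWS SDK for JavaScript (v2). The bucket, database, table, and column names below are made up, and you’d still need AWS credentials plus an S3 location where Athena can write query results.

```javascript
// Rough sketch: querying CSV files that sit in an S3 bucket, via Amazon Athena
// and the AWS SDK for JavaScript (v2). All names below are placeholders.

// One-time setup, run in the Athena console or via the same API: point a table
// definition at the S3 location holding your files (assumes a database named
// 'logs' has already been created).
//
//   CREATE EXTERNAL TABLE IF NOT EXISTS logs.web_requests (
//     request_time string,
//     status_code  int,
//     bytes_sent   bigint
//   )
//   ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
//   LOCATION 's3://my-example-bucket/web-logs/';

const AWS = require("aws-sdk");
const athena = new AWS.Athena({ region: "us-east-1" });

// Submit ordinary SQL against the files in the bucket. Athena bills per data
// scanned (about $5 per TB); there are no servers or database clusters to run.
async function runQuery() {
  const { QueryExecutionId } = await athena
    .startQueryExecution({
      QueryString: `
        SELECT status_code, count(*) AS hits
        FROM logs.web_requests
        GROUP BY status_code
        ORDER BY hits DESC`,
      QueryExecutionContext: { Database: "logs" },
      ResultConfiguration: {
        OutputLocation: "s3://my-example-bucket/athena-results/",
      },
    })
    .promise();

  // Results land in the output location; poll getQueryExecution /
  // getQueryResults with this id to read them programmatically.
  return QueryExecutionId;
}

runQuery().then((id) => console.log("Started Athena query:", id));
```

The same idea works from the Athena console: once a table definition points at your bucket, any SQL you type there runs directly against the files sitting in S3.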

One note: Google has something similar called BigQuery, so that might be worth a look as well. I’ve explored BigQuery briefly, but I keep coming back to various AWS services since AWS seems to be holding strong as a dominant leader in emerging cloud technologies. But as we all know, the emerging-technology landscape can change very quickly!