What is Computer Vision?

Computer Vision is a rapidly growing technology field that most people know little about. While Computer Vision has been around since the 1960s, its growth really exploded with the creation of the OpenCV library. This library provides the tools software engineers need to create Computer Vision applications.

But what is Computer Vision? Computer Vision is a mix of hardware and software tools that are used to identify objects from a photo or camera input. One of the more well-known applications of Computer Vision is in self-driving cars. In a self-driving car, numerous cameras collect video inputs. The video streams are then examined to find objects such as road signs, people, stop lights, lines on the road, and other data that would be essential for safe driving.

However, this technology isn’t just available in self-driving cars. A vehicle I rented a few months ago was able to read speed limit signs as I passed by and display that information on the dash. Additionally, if I failed to signal a lane change, the car would beep when I got close to the line.

Another common place to find Computer Vision is in factory automation. In this setting, specialized programs may monitor products for defects, check the status of machinery for leaks or other problematic conditions, or monitor the actions of people to ensure safe machine operation. With these tools, companies can make better products more safely.

Computer Vision and Artificial Intelligence are also becoming more popular for medical applications. MRI or X-ray images can be processed using Computer Vision and AI tools to identify cancerous tumors or other health problems.

On a less practical note, Computer Vision tools are also used to modify user-generated videos. This may include things such as adding a hat or making a funny face. Or, it may be used to identify faces in an image for tagging.

Ultimately, Computer Vision technologies are showing up in more and more places each day and, when coupled with AI, will result in a far more technologically advanced world.

What is the Dark Web?

Most people have heard of the Dark Web in news stories or tech articles. But what is it? How does it work? Is it worth visiting?

The Dark Web is a hidden network of highly encrypted machines that is accessible over the internet, but not through a typical web browser. While any content can be stored on the Dark Web, the majority of it is of a questionable nature – such as child pornography, snuff films, drug and fake ID stores, and similar content. One such marketplace, Dark Amazon, provides users an Amazon-like shopping experience.

However, not everything on the Dark Web is illegal or unethical. In fact, the Dark Web can be a very useful tool for individuals in China to access Facebook (they have a Dark Web site) or for intelligence operatives in Iran to contact the CIA (they’re on the Dark Web too).

In short, the Dark Web is a useful tool if you are trying to remain anonymous or operating within a country that has strict internet controls. But what do you need to get started? Simple – download a TOR browser. TOR stands for The Onion Router; the onion metaphor refers to the multiple layers of encryption wrapped around your traffic, which is routed through numerous machines to prevent tracking. TOR browsers exist for all major platforms – including mobile – and are really no different to use than any other web browser.

The real challenge, however, is finding content. For that, you will need a Dark Web search engine – such as TOR66. However, a few suggestions before you give it a try. First, run a VPN. While there is nothing illegal about using a TOR browser, you may draw suspicion from your ISP, and it is always possible that TOR users are being watched by law enforcement. Second, make sure the antivirus on your machine is up to date – you never know what’s out there. Third, trust no one. The Dark Web is a place of thieves, con artists, drug dealers, and other people with questionable ethics.

Productivity Gains through Aliases & Scripts

For users of Mac or Linux-based machines, aliases and scripts are some of the most valuable tools for increasing productivity. Even if you run a Windows machine, there is a strong possibility that some machines you interact with – such as AWS servers – run a Linux-based operating system.

So, what are aliases and scripts? Scripts are files that contain a sequence of instructions needed to perform a complex procedure. I often create scripts to deploy software applications to development servers or to execute complex software builds. Aliases are much shorter, single-line commands that are typically placed in a system startup file such as the .bash_profile file on MacOS.

Below are some of the aliases I use. Since Mac doesn’t have an hd command (like Linux does), I have aliased it to call hexdump -C. Additionally, Mac has no rot13 command – an old utility that performs a Caesar cipher – so I have aliased it to use tr.

Since I spend a lot of time at the command prompt, I have created a variety of aliases to shortcut directory navigation, including a series of up commands to move me up the directory hierarchy (particularly useful in a large build structure) and a command to take me to the root folder of a git project.

Finally, I have a command to show me the last file created or downloaded. This can be particularly useful; for example, to view the most recently created file, I can simply execute cat `lastfile`.

alias hd='hexdump -C'
alias df='df -h'
alias rot13="tr 'A-Za-z' 'N-ZA-Mn-za-m'"
alias up='cd ..'
alias up2='cd ../..'
alias up3='cd ../../..'
alias up4='cd ../../../..'
alias up5='cd ../../../../..'
alias up6='cd ../../../../../..'
alias root='cd `git rev-parse --show-toplevel`'
alias lastfile="ls -t | head -1"

One common script I use is bigdir. This script will show me the size of all folders in my current directory. This can help me locate folders taking up significant space on my computer.
#!/bin/bash

# preserve IFS and set it so filenames containing spaces are not split
SAVEIFS=$IFS
IFS=$(echo -en "\n\b")

# report the disk usage of each directory in the current folder
for file in `ls`
do
        if [ -d "$file" ]
        then
                du -hs "$file" 2> /dev/null
        fi
done
IFS=$SAVEIFS

Another script I use helps me find a text string within the files of a folder.
#!/bin/bash

SAVEIFS=$IFS
IFS=$(echo -en "\n\b")

if [ $# -ne 1 ]
then
        echo "Call is: `basename $0` string"
else
        # search every file under the current directory
        for file in `find . -type f | cut -c3-`
        do
                # count case-insensitive matches of the search string in this file
                count=`grep -i "$1" "$file" | wc -l`
                if [ $count -gt 0 ]
                then
                        echo "******"$file"******"
                        grep -i "$1" "$file"
                fi
        done
fi
IFS=$SAVEIFS

These are just a few examples of ways to use scripts and aliases to improve your productivity. Do you have a favorite script or alias? Share it below!

Dangers of Artificial Intelligence

Risk/Reward

With the growth of artificial intelligence, one of the subjects we don’t discuss enough is the possible dangers it may create. While AI may help us better drive our cars or provide faster, more accurate diagnoses of medical issues, it may also create problems for society. What are those problems, and what should we do to minimize those risks?

Poorly Tested Code

As a software engineer, my biggest worry is that poor-quality code will be widely deployed in artificial intelligence systems. Look around today, and you will see what I mean. I use a Mac, and the current version of Safari is riddled with bugs. Indeed, nearly every application on my computer has several updates per year to address bugs.

This is caused, in part, by the demands of businesses. I have worked for many companies over the years that wanted to push out a new version even when known bugs existed. For the business, this is necessary to ensure they beat the competition to release new features. However, this acceptance of buggy software can be disastrous in the world of AI.

For example, what happens when the artificial intelligence system misdiagnoses cancer? For the individual, this could have life-altering effects. What about the self-driving car? Someone could be hit and killed.

How good is good enough for artificial intelligence? I don’t have an answer, but it is something businesses need to strongly consider as they dive deeper into the world of AI.

Deep Fakes

A growing concern for artificial intelligence is how it could be used by organizations or political entities to persuade consumers or voters. For example, a political adversary of the president could create a fake video of the president engaged in some behavior that would bring discredit upon him. How would the electorate know it is a fake? Even worse, how could our nation’s enemies use fake videos for propaganda purposes here or abroad?

Employment

In many ways, advances in artificial intelligence are very similar to the changes during the industrial revolution. As AI becomes more advanced, we can expect to see more and more jobs performed by intelligent robots or computer systems. While this will benefit businesses that can cut payroll, it will have a negative impact on laborers who can easily be replaced.

What Should We Do?

This is just a very small list of potential issues. Indeed, numerous techies have discussed countless other risks we face as we adopt more AI-based systems. But what should we do? The value of AI to our lives will be profound, but we must start to consider how we will address these challenges from both a legal and a societal perspective.

For example, we may need to create laws regarding liability for AI systems to ensure that businesses provide adequate testing before deploying systems. But problems like deep fakes and employment aren’t as easy to fix. We can certainly provide training to individuals who are displaced by AI, but as more and more jobs are replaced, where will all the workers go?

I don’t have the answers. However, I think it is time for techies and non-techies alike to start asking the questions so that we can reap the benefits of improving artificial intelligence while mitigating the potential risks.

Getting an IT Job Without a Degree

I frequently talk to high school students or young adults who are hoping to land a lucrative IT job without a degree. Unfortunately, few of these individuals have the skills necessary to get the job they want. While many high schools now offer an increasing number of computer courses, rarely do they provide the depth or breadth of skills required by employers. However, this does not mean you need a degree to work in IT. In fact, some of the best techies I know started their career without a degree.

If it is possible to get a job without a degree, how do you do it? First, it’s important to recognize that IT jobs are broadly divided into two groups – system management and software development. System management jobs involve the management of computer systems, networks, servers, and other computer hardware. Additionally, cybersecurity professionals fall into this category (although there is often some overlap with software development skills). Software development jobs include web developers, software engineers, mobile application developers, and a variety of other jobs focused on using computer code to create applications for users.

Information Technology Certifications

Typically, individuals with system management jobs have degrees in Information Technology Management. However, those without a degree can show their competence with a variety of tech certifications. Some of the most widely known certifications are from the Computing Technology Industry Association, better known as CompTIA. This includes CompTIA’s best-known certification for desktop maintenance and support – the A+ certification. However, CompTIA offers a variety of other entry-level certifications as well: Network+ shows competency with network management, and Security+ demonstrates basic security knowledge.

In addition to CompTIA certifications, a variety of other organizations provide IT certifications such as Cisco’s CCNA, Amazon’s AWS Certified Solutions Architect, and Google’s Associate Cloud Engineer. These certifications – unlike those from CompTIA – are vendor specific. However, the skills these certifications demonstrate are highly valuable to businesses.

Software Development Projects

Software developers typically have a bachelor’s degree in Computer Science. And, while there are some certifications available for programmers, they are not as widely desired as those for IT management. As such, it’s more difficult to demonstrate programming skills to a potential employer. However, this can be overcome by providing sample code on GitHub or BitBucket. If you want a job as a developer, spend some time creating professional-quality software applications that demonstrate your knowledge. Then, be sure to include a link to your repository in your resume. While you learn to code, don’t neglect learning SQL, HTML, and JavaScript. During the last decade, these skills have become standard for nearly all software development jobs.

I’ve talked to many young men who would like to become game developers. For them, I recommend first considering their background in math and physics. While there are libraries that make game programming easier, it’s hard to get far without some knowledge of matrix manipulation, trigonometry, gravity, and other topics that require a solid background in math and science.

Conclusion

While most people enter the IT world with a bachelor’s degree, it is possible to find good jobs without a formal education. If you want to work in the system management field, focus on certifications. If you want to work in software, focus on projects you can demo to show your ability. Either path requires effort – there really are no shortcuts in the IT world – but if you expect an employer to pay you the high salaries common in IT, that effort will be well compensated.

Crypto Currency Problems

Even with the recent decline in cryptocurrency prices, enthusiasm remains high among blockchain supporters. However, after more than a decade, several key problems still remain before widespread adoption can be expected.

Investment / Currency Dilemma

The first problem is the investment/currency dilemma. Blockchain evangelists repeatedly tell us what an amazing investment crypto currencies are. Then, they tell us how crypto is replacing fiat currencies. Unfortunately, it’s not possible to be both an investment and a currency. Why? Because the two are – for the most part – mutually exclusive. Investments require volatility – something we see in abundance with crypto currencies. However, an actual currency requires stability. Nobody wants to be paid for work done this month at a wage that fluctuates wildly. So, we need to decide which it is – a currency or an investment.

While some currencies – known as stablecoins – strive to maintain a 1-to-1 relationship with the dollar, this seems to fly in the face of the argument that fiat currencies should be replaced with crypto currencies. While these stablecoins may work great for purchasing goods and services, why not simply use the dollar instead and save yourself the transaction costs?

Energy Consumption

I have previously blogged about the crypto energy issue. In short, crypto currencies consume massive amounts of electricity while many around the globe argue that we need to reduce energy usage to prevent climate change. However, even if you reject climate change, it’s no secret that many places around the world suffer from energy shortages. Even in the US, brownouts are not uncommon in many communities on the hottest days of summer. Is it really reasonable to consume massive quantities of energy to create digital money?

Cyber Terrorism

Crypto has a long history of being used for money laundering, drugs, hacking, and other nefarious purposes. While many will argue that this represents only a very small portion of the crypto market, it is nonetheless a real concern that the crypto community needs to address. This is particularly obvious with the growth of ransomware demanding payment in Bitcoin. Regardless of the actual percentage of illicit usage, it still reflects poorly on crypto currencies and will invite increasing oversight by government entities, which could negatively impact the crypto markets and the long-term viability of blockchain technologies.

Quantum Computing

Nobody seems to talk about it much, but quantum computing could unravel the entire blockchain in the blink of an eye. Since crypto currencies depend on encryption, it is absolutely essential that the encryption algorithms used be unhackable. Given the history of encryption protocols, that seems unlikely. It becomes even more unlikely when quantum computing enters the mainstream. While it may be years off, the introduction of a large quantum computer would allow its owner to rewrite the entire blockchain by simply controlling the majority of the network’s computing power – something not unreasonable with a modest quantum computer. This would rapidly shift financial power into the hands of a single individual.

Conclusion

While I continue to hear people talk about all the great things crypto currencies have to offer, few are interested in addressing the issues that will either prevent widespread adoption or create growing threats to commerce moving forward. If, indeed, nobody is interested in resolving these issues to support the long-term growth of blockchain technologies, then doesn’t it support the notion that this is really nothing more than a Ponzi scheme?

Basics of Artificial Intelligence – V

Up to this point, we have talked about some of the fundamental algorithms for artificial intelligence and how they can be implemented in Java. Java is a great choice for its speed and wide usage in the software world. However, Java is not the only choice for implementing artificial intelligence. In this post, we will examine some of the most popular languages for creating artificial intelligence solutions.

Java

Java is one of the most widely used computer programming languages available today. Since its development in the 1990s, Java has been widely used for web development as well as for creating cross-platform applications. Java runs in a virtual machine – the Java Virtual Machine (JVM) – and any computer that has an implementation of the JVM can run a Java program. Additional languages that are compatible with the JVM have been developed as well, including Scala, Groovy, and Kotlin. Java is object oriented, compiled, and strongly typed. Compiled languages are fast, but strongly typed languages can be problematic in artificial intelligence because data structures must be well defined, or generics implemented, which can complicate code.

R

R is a statistical programming language used more by statisticians than computer programmers. It is designed to work with matrices of data and, as such, is very well suited to AI development. Additionally, R has a multitude of packages that can easily create graphs and charts to help analyze data dependencies. However, R is lacking in ease of use, and it is better suited to research than to deploying AI applications.

Python

Python has been around since the early 90’s. However, its mainstream use has only exploded during the last decade or so. Because of its simple syntax, Python has been widely embraced by people outside of the programming community – and in educational settings. As a result, Python is now heavily used for utilities, system administration tasks, automation, REST-based web services, and artificial intelligence. Furthermore, Python has excellent frameworks and tools for AI development. Of particular interest are Jupyter and SciKit Learn. These tools greatly simplify AI development and allow developers to solve problems more quickly than in Java, with substantially less setup and expertise.

MATLAB

While talking about AI languages, I must also mention MATLAB – or its open-source alternative, Octave. These platforms are incredibly popular in academic communities. However, MATLAB – and the associated toolkits – are expensive and far more difficult to use than Python. Additionally – like R – they don’t really create deployable solutions for customers. However, if you are a mathematician, you may find MATLAB more to your liking.

Conclusion

When I work on artificial intelligence code, I will often use R and Python. While I have been a Java developer for years, and have implemented various AI solutions using Java, I find it far more complicated than the alternatives. I often use R for analyzing correlation, creating charts, and performing statistical analysis of data using R Studio. Then, when it’s time to actually create the neural network, I will use Python and Jupyter.

If you prefer another language, AI frameworks are available – or can be created – for it. If you want the fastest solution, you may look into C libraries. If you want something that will run in a browser, JavaScript may provide a better solution. In short, there are a variety of options for AI. However, for the novice, you’ll probably not find anything better than Python to get you started.

Basics of Artificial Intelligence – IV

Previously, we examined various functions that are used across a variety of artificial intelligence applications. Today, we’re looking at a specific algorithm. While not typically considered artificial intelligence, linear regression is the most basic means of allowing a computer to learn how to solve a problem. For linear regression, the user provides an array of input values as well as an array of expected output values. In algebra, these would be the x and y values of the equation respectively. Additionally, the user will need to provide a degree for the polynomial. This is the highest exponent for the x value in the equation. For example, a third-degree polynomial would be ax^3 + bx^2 + cx + d.
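Note that the code below relies on a Polynomial.calculate helper from the referenced library to evaluate a polynomial at a given x value. That class isn’t reproduced in this post, but a minimal sketch – assuming the coefficient array is ordered from the constant term up to the highest-degree term, which is my assumption rather than something specified by the library – might look like this:

public class Polynomial {
  // Evaluate the polynomial at x using Horner's method.
  // Assumes coefficients[0] is the constant term and
  // coefficients[coefficients.length - 1] is the highest-degree term.
  public static float calculate(float x, float[] coefficients) {
    float result = 0;
    for (int i = coefficients.length - 1; i >= 0; --i) {
      result = result * x + coefficients[i];
    }
    return result;
  }
}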

Our first class will be the generic base class shared across all linear regression implementations. In this class, we define a method to calculate the score of a set of coefficients as well as an abstract method to calculate the coefficients. NOTE: Referenced code is available for download from BitBucket.

import com.talixa.techlib.ai.general.Errors;
import com.talixa.techlib.math.Polynomial;

public abstract class PolyFinder {
  protected float[] input;
  protected float[] idealOutput;
  protected float[] actualOutput;
  protected float[] bestCoefficients;
  protected int degree;
	
  public PolyFinder(float[] input, float[] idealOutput, int degree) {
    this.input = input;
    this.idealOutput = idealOutput;
    this.actualOutput = new float[idealOutput.length];
    this.bestCoefficients = new float[degree+1];
    this.degree = degree;
  }

  public abstract float[] getCoefficients(int maxIterations);
	
  protected float calculateScore(float[] coefficients) {
    // iterate through all input values and calculate the output
    // based on the generated polynomials
    for(int i = 0; i < input.length; ++i) {
      actualOutput[i] = Polynomial.calculate(input[i], coefficients);
    }

    // return the error of this set of coefficients
    return Errors.sumOfSquares(idealOutput, actualOutput);
  }
}

Our next step is to create an actual implementation of code to get the coefficients. Multiple methods are available, but we will look at the simplest – greedy random training. In greedy random training, the system generates random values and keeps the values with the lowest error score. It’s a trivial implementation and works well for low-order polynomials.

import java.util.Arrays;
import com.talixa.techlib.ai.prng.RandomLCG;

public class PolyGreedy extends PolyFinder {
  private float minX;
  private float maxX;
	
  public PolyGreedy(float[] trainingInput, float[] idealOutput, int degree, float minX, float maxX) {
    super(trainingInput, idealOutput, degree);
    this.minX = minX;
    this.maxX = maxX;
  }
	
  public float[] getCoefficients(int maxIterations) {
    // iterate through the coefficient generator maxIterations times
    for(int i = 0; i < maxIterations; ++i) {
      iterate();
    }
    // return a copy of the best coefficients found
    return Arrays.copyOf(bestCoefficients, bestCoefficients.length);
  }
	
  private void iterate() {
    // get score with current values
    float oldScore = calculateScore(bestCoefficients);
		
    // randomly determine new values
    float[] newCoefficients = new float[degree+1];
    for(int i = 0; i < (degree+1); ++i) {
      newCoefficients[i] = RandomLCG.getNextInt() % (maxX - minX) + minX;
    }
		
    // test score with new values
    float newScore = calculateScore(newCoefficients);
		
    // determine if better match
    if (newScore < oldScore) {
      bestCoefficients = newCoefficients;
    }
  }
}

With greedy random training, we define the min and max values for the coefficients and then iterate over and over, selecting random values for the equation. Each time a new set of coefficients is generated, its score is compared with the current best score. If the new score is better, the new coefficients become the winner. This algorithm can be run thousands of times to quickly create a set of coefficients to solve the equation.
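As a rough usage sketch (the data points here are invented for illustration), fitting a first-degree polynomial might look like the following:

// Hypothetical sample data: points that roughly follow y = 2x + 1
float[] x = {1f, 2f, 3f, 4f, 5f};
float[] y = {3f, 5f, 7f, 9f, 11f};

// Search for first-degree (linear) coefficients in the range -10 to 10
PolyGreedy greedy = new PolyGreedy(x, y, 1, -10f, 10f);
float[] coefficients = greedy.getCoefficients(10000);

// Print the fitted coefficients (ordering follows the Polynomial.calculate convention)
System.out.println(java.util.Arrays.toString(coefficients));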

For many datasets, this can produce a workable answer within a short time. However, linear regression works best on less complicated datasets where some relationship between the x and y values is known to exist. In cases of multiple input values where the relationship between variables is less clear, other algorithms may provide a better answer.

Basics of Artificial Intelligence – III

Some artificial intelligence algorithms like input values to be normalized. This means that all data is presented within a predefined range, typically either 0 to 1 or -1 to 1. Normalization algorithms take an array of input values and return an array of normalized values.

Denormalization is the opposite process. In denormalization, an input array of normalized values is presented and the original values are returned. Denormalization is useful when the output value of an AI algorithm is normalized. Since the normalized value is not in an expected range, the user must denormalize to determine the real number.

A simple example of number normalization is the Celsius temperature scale. All temperatures where water exists as a liquid exist between the values of 0 and 100. To normalize the temperature for an AI algorithm, I could simply divide each input by 100 to create an array of numbers between 0 and 1. When the output value is .17, the user would denormalize by multiplying by 100 to get a value of 17 degrees.

Of course, most normalization is not this simple, so we use algorithms to do the work.

public static float[] normalizeData(final float[] inputVector, final float minVal, final float maxVal) {
	float[] normalizedData = new float[inputVector.length];
	float dataRange = maxVal - minVal;
	for(int i = 0; i < inputVector.length; ++i) {
		float d = inputVector[i] - minVal;
		float percent = d / dataRange;
		float dnorm = NORMALIZE_RANGE * percent;
		float norm = NORMALIZE_LOW_VALUE + dnorm;
		normalizedData[i] = norm;
	}
	return normalizedData;
}

Note that two constants are defined outside this function: NORMALIZE_RANGE, which is 2 when normalizing to the range -1 to 1 and 1 when normalizing to the range 0 to 1, and NORMALIZE_LOW_VALUE, which is the low end of the target range, either -1 or 0.
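For reference, here is how those constants might be declared when normalizing to the -1 to 1 range (the exact declarations in the referenced library may differ):

// Assumed declarations for normalizing to the range -1 to 1
private static final float NORMALIZE_RANGE = 2;      // width of the target range
private static final float NORMALIZE_LOW_VALUE = -1; // low end of the target range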

In the above normalization function, the user provides an array of input values as well as a min and max value for normalization. Then, we create a new array to hold the normalized values. The code then iterates through each input value, computes its normalized value, and adds it to the normalized data array that is returned to the user. The actual normalization takes the following steps:

  • subtract the minimum value from the input value
  • divide the result by the data range to determine a percentage
  • multiply the normalized range by that percentage
  • add the result to the normalized low value

For a concrete example, consider normalizing degrees Fahrenheit. If we were to input an array of daily temperatures, we might have [70, 75, 68]. For the normalization range, we would pick 32 and 212. Following the above steps for the first temperature:

  • 70 – 32 = 38
  • 38 / (212 – 32) = .21
  • 2 * .21 = .42
  • -1 + .42 = -.58

If we followed through with the other temperatures, we would end up with an output array of [-.58, -.52, -.60]. To denormalize, the denormalization function below can be used. Note, you must use the same min and max values that you used in normalization, or your denormalized output values will not be on the same scale as your input values!

public static float[] denormalizeData(final float[] normalizedData, final float minVal, final float maxVal) {
	float[] denormalizedData = new float[normalizedData.length];
	float dataRange = maxVal - minVal;
	for(int i = 0; i < normalizedData.length; ++i) {
		float dist = normalizedData[i] - NORMALIZE_LOW_VALUE;
		float pct = dist / NORMALIZE_RANGE;
		float dnorm = pct * dataRange;
		denormalizedData[i] = dnorm + minVal;
	}
	return denormalizedData;
}
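Putting the two functions together, here is a quick sketch of a round trip using the Fahrenheit readings from the example above (assuming the constants are set for the -1 to 1 range):

float[] temps = {70f, 75f, 68f};

// Normalize using the freezing and boiling points of water as the data range
float[] normalized = normalizeData(temps, 32f, 212f);      // roughly [-.58, -.52, -.60]

// Denormalize with the SAME min and max to recover the original readings
float[] restored = denormalizeData(normalized, 32f, 212f); // roughly [70, 75, 68]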

This is the most basic normalization function. Other options may be to use the reciprocal of a number (but this only works for numbers greater than 1 or less than -1) or to use a Z-score.

Basics of Artificial Intelligence – II

Last week, we talked about distance calculations for Artificial Intelligence. Once you’ve learned how to calculate distance, you need to learn how to calculate an overall error for your algorithm. There are three main algorithms for error calculation: sum of squares, mean squared, and root mean squared. They are all relatively simple, but they are key to any Machine Learning algorithm. As an AI algorithm iterates over data time and time again, it will try to find a better solution than the previous iteration. A lower error score indicates a better answer and progress toward the best solution.

The error algorithms are similar to the distance algorithms. However, distance measures how far apart two points are whereas error measures how far the AI output answers are from the expected answers. The three algorithms below show how each error is calculated. Note that each one builds on the one before it. The sum of squares error is – as the name suggests – a summation of the square of the errors of each answer. Note that as the number of answers increases, the sum of squares value will too. Thus, to compare errors with different numbers of values, we need to divide by the number of items to get the mean squared error. Finally, if you want to have a number in a similar range to the original answer, you need to take the square root of the mean squared error.

public static float sumOfSquares(final float[] expected, final float[] actual) {
	float sum = 0;
	for(int i = 0; i < expected.length; ++i) {
		sum += Math.pow(expected[i] - actual[i], 2);
	}
	return sum;
}
	
public static float meanSquared(final float[] expected, final float[] actual) {
	return sumOfSquares(expected, actual)/expected.length;
}
	

public static float rootMeanSquared(final float[] expected, final float[] actual) {
	return (float)Math.sqrt(meanSquared(expected,actual));
}
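As a quick illustration of how these functions relate (the numbers here are invented), note how each error measure builds on the previous one:

float[] expected = {1.0f, 2.0f, 3.0f};
float[] actual   = {1.1f, 1.9f, 3.2f};

float sse  = sumOfSquares(expected, actual);    // 0.01 + 0.01 + 0.04 = 0.06
float mse  = meanSquared(expected, actual);     // 0.06 / 3 = 0.02
float rmse = rootMeanSquared(expected, actual); // sqrt(0.02), approximately 0.14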