Basics of Artificial Intelligence – III

Some artificial intelligence algorithms expect their input values to be normalized. This means that all data is scaled to fall within a predefined range, typically either 0 to 1 or -1 to 1. Normalization algorithms take an array of input values and return an array of normalized values.

Denormalization is the opposite process. In denormalization, an array of normalized values is presented and the original values are returned. Denormalization is useful when the output value of an AI algorithm is normalized. Because the normalized value is not on the scale the user expects, the user must denormalize it to determine the real number.

A simple example of number normalization uses the Celsius temperature scale. At standard pressure, water is a liquid between 0 and 100 degrees. To normalize such temperatures for an AI algorithm, I could simply divide each input by 100 to create an array of numbers between 0 and 1. When the output value is .17, the user would denormalize by multiplying by 100 to get a value of 17 degrees.
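The Celsius case is small enough to sketch in a few lines. The class and method names here are my own placeholders, and the code assumes every input already lies between 0 and 100:

```java
// Hypothetical sketch: names are illustrative, not part of any library.
final class CelsiusScaling {
    // Assumed upper bound: the boiling point of water in degrees Celsius.
    static final float MAX_CELSIUS = 100f;

    // Map temperatures in [0, 100] degrees Celsius onto [0, 1].
    static float[] normalize(final float[] celsius) {
        float[] out = new float[celsius.length];
        for (int i = 0; i < celsius.length; ++i) {
            out[i] = celsius[i] / MAX_CELSIUS;
        }
        return out;
    }

    // Map a normalized value in [0, 1] back to degrees Celsius.
    static float denormalize(final float normalized) {
        return normalized * MAX_CELSIUS;
    }

    public static void main(String[] args) {
        float[] normalized = normalize(new float[] {0f, 17f, 100f});
        System.out.println(java.util.Arrays.toString(normalized));
        System.out.println(denormalize(0.17f));
    }
}
```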

Of course, most normalization is not this simple, so we use algorithms to do the work.

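// Scales each input from [minVal, maxVal] into the target range defined by
// NORMALIZE_LOW_VALUE and NORMALIZE_RANGE (both declared outside this function).
public static float[] normalizeData(final float[] inputVector, final float minVal, final float maxVal) {
	float[] normalizedData = new float[inputVector.length];
	float dataRange = maxVal - minVal; // width of the input range; must not be zero
	for(int i = 0; i < inputVector.length; ++i) {
		float d = inputVector[i] - minVal;        // distance above the minimum
		float percent = d / dataRange;            // fraction of the input range
		float dnorm = NORMALIZE_RANGE * percent;  // same fraction of the target range
		float norm = NORMALIZE_LOW_VALUE + dnorm; // shift to start at the target low value
		normalizedData[i] = norm;
	}
	return normalizedData;
}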

Note that two constants are defined outside this function. NORMALIZE_RANGE is the width of the target range: 2 when normalizing to the range of -1 to 1, or 1 when normalizing to the range of 0 to 1. NORMALIZE_LOW_VALUE is the low end of the target range, either -1 or 0.
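As a rough sketch, the constants might be declared like this for a -1 to 1 target range. The enclosing class name and NORMALIZE_HIGH_VALUE are my own additions for illustration. Only NORMALIZE_LOW_VALUE and NORMALIZE_RANGE are used by the function above.

```java
// Hypothetical declarations; the class name and NORMALIZE_HIGH_VALUE are assumptions.
final class NormalizationConstants {
    // Low end of the target range: -1 here, or 0f for a 0 to 1 range.
    static final float NORMALIZE_LOW_VALUE = -1f;
    // High end of the target range (assumed helper constant).
    static final float NORMALIZE_HIGH_VALUE = 1f;
    // Width of the target range: 2 for -1 to 1, 1 for 0 to 1.
    static final float NORMALIZE_RANGE = NORMALIZE_HIGH_VALUE - NORMALIZE_LOW_VALUE;

    public static void main(String[] args) {
        System.out.println("low=" + NORMALIZE_LOW_VALUE + " range=" + NORMALIZE_RANGE);
    }
}
```

Deriving the range from the two ends keeps the pair consistent if you later switch to a 0 to 1 target.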

In the above normalization function, the user provides an array of input values as well as the minimum and maximum values that the input data can take. We create a new array to hold the normalized values, then iterate through the inputs, computing the normalized value for each one and storing it in the array that is returned to the user. The actual normalization takes the following steps:

  • Subtract the minimum value from the input value.
  • Divide the result by the data range to determine a percentage.
  • Multiply the normalized range by the percentage.
  • Add the result to the normalized low value.
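Taken together, the four steps above amount to a single formula. This is only a restatement of the steps, where x is one input value, minVal and maxVal are the bounds passed to the function, and the symbol for the result is one I am introducing here:

```latex
x_{\text{norm}} = \text{NORMALIZE\_LOW\_VALUE}
  + \text{NORMALIZE\_RANGE} \cdot \frac{x - \text{minVal}}{\text{maxVal} - \text{minVal}}
```

With a target range of 0 to 1, the low value is 0 and the range is 1, so the formula reduces to the fraction alone.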

For a concrete example, consider normalizing degrees Fahrenheit to the range of -1 to 1. If we were to input an array of daily temperatures, we might have [70, 75, 68]. For the minimum and maximum values, we would pick 32 and 212, the freezing and boiling points of water. Following the above steps for the first temperature:

  • 70 – 32 = 38
  • 38 / (212 – 32) = .21
  • 2 * .21 = .42
  • -1 + .42 = -.58
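To check the arithmetic without rounding at each step, here is a small sketch that records every intermediate value. The class and method names are placeholders of mine, and the target range of -1 to 1 is passed in explicitly:

```java
// Hypothetical sketch: traces the four normalization steps for one input.
final class FahrenheitTrace {
    // Returns {difference, percent, scaled, normalized} for a single value.
    static float[] trace(float input, float minVal, float maxVal, float low, float range) {
        float d = input - minVal;               // step 1: subtract the minimum
        float percent = d / (maxVal - minVal);  // step 2: fraction of the data range
        float dnorm = range * percent;          // step 3: scale to the target range
        float norm = low + dnorm;               // step 4: add the target low value
        return new float[] {d, percent, dnorm, norm};
    }

    public static void main(String[] args) {
        for (float t : new float[] {70f, 75f, 68f}) {
            float[] s = trace(t, 32f, 212f, -1f, 2f);
            System.out.printf("%.0f: d=%.0f percent=%.4f scaled=%.4f norm=%.4f%n",
                    t, s[0], s[1], s[2], s[3]);
        }
    }
}
```

Carrying full precision gives roughly .2111 rather than .21 for the percentage of the first temperature, and the final value still rounds to -.58.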

If we followed through with the other temperatures, we would end with an output array of [-.58, -.52, -.60]. To denormalize, the denormalization function below can be used. Note that you must use the same min and max values that you used in normalization, or your denormalized output values will not be on the same scale as your input values!

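// Reverses normalizeData: maps each value from the target range back into [minVal, maxVal].
public static float[] denormalizeData(final float[] normalizedData, final float minVal, final float maxVal) {
	float[] denormalizedData = new float[normalizedData.length];
	float dataRange = maxVal - minVal; // width of the original data range
	for(int i = 0; i < normalizedData.length; ++i) {
		float dist = normalizedData[i] - NORMALIZE_LOW_VALUE; // distance above the target low value
		float pct = dist / NORMALIZE_RANGE;                   // fraction of the target range
		float dnorm = pct * dataRange;                        // same fraction of the data range
		denormalizedData[i] = dnorm + minVal;                 // shift back to start at minVal
	}
	return denormalizedData;
}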
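To see why the min and max values have to match, here is a self-contained round-trip sketch. It restates both functions in condensed form, assumes the constants are set for a -1 to 1 range, and uses bounds of 0 and 100 purely as an example of a mismatch:

```java
// A sketch only: the constant values are assumed for a -1 to 1 target range.
final class RoundTripSketch {
    static final float NORMALIZE_LOW_VALUE = -1f;
    static final float NORMALIZE_RANGE = 2f;

    static float[] normalizeData(final float[] inputVector, final float minVal, final float maxVal) {
        float[] normalizedData = new float[inputVector.length];
        float dataRange = maxVal - minVal;
        for (int i = 0; i < inputVector.length; ++i) {
            float percent = (inputVector[i] - minVal) / dataRange;
            normalizedData[i] = NORMALIZE_LOW_VALUE + NORMALIZE_RANGE * percent;
        }
        return normalizedData;
    }

    static float[] denormalizeData(final float[] normalizedData, final float minVal, final float maxVal) {
        float[] denormalizedData = new float[normalizedData.length];
        float dataRange = maxVal - minVal;
        for (int i = 0; i < normalizedData.length; ++i) {
            float pct = (normalizedData[i] - NORMALIZE_LOW_VALUE) / NORMALIZE_RANGE;
            denormalizedData[i] = pct * dataRange + minVal;
        }
        return denormalizedData;
    }

    public static void main(String[] args) {
        float[] temps = {70f, 75f, 68f};
        float[] normalized = normalizeData(temps, 32f, 212f);
        // Same bounds: recovers the inputs, up to floating-point rounding.
        float[] restored = denormalizeData(normalized, 32f, 212f);
        // Different bounds (illustrative only): the values land on another scale.
        float[] mismatched = denormalizeData(normalized, 0f, 100f);
        System.out.println(java.util.Arrays.toString(restored));
        System.out.println(java.util.Arrays.toString(mismatched));
    }
}
```

With mismatched bounds the first temperature comes back as roughly 21 instead of 70, which is exactly the scale problem described above.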

This is the most basic normalization function. Other options include using the reciprocal of a number (but this only works for numbers greater than 1 or less than -1) or using a Z-score.
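For comparison, here is a minimal sketch of both alternatives. The names are my own, and the Z-score version assumes the population standard deviation (dividing by the number of values) and an input whose values are not all identical:

```java
// Hypothetical sketch of the two alternatives; not a definitive implementation.
final class AlternativeScalingSketch {
    // Reciprocal: maps a value with magnitude greater than 1 into (-1, 1).
    static float reciprocal(final float value) {
        return 1f / value;
    }

    // Z-score: how many standard deviations each value lies from the mean.
    // Assumes a population standard deviation and a non-constant input.
    static float[] zScore(final float[] input) {
        double sum = 0;
        for (float v : input) {
            sum += v;
        }
        double mean = sum / input.length;
        double squared = 0;
        for (float v : input) {
            squared += (v - mean) * (v - mean);
        }
        double std = Math.sqrt(squared / input.length);
        float[] out = new float[input.length];
        for (int i = 0; i < input.length; ++i) {
            out[i] = (float) ((input[i] - mean) / std);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(reciprocal(4f));
        System.out.println(java.util.Arrays.toString(zScore(new float[] {70f, 75f, 68f})));
    }
}
```

Unlike the min and max approach, a Z-score is not confined to a fixed range, so the result can fall outside -1 to 1.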
