Things I Learned While Trying To Solve Euler Project Problem #3

Euler Project problem 3 took me way longer than I expected. While I am not a real programmer, I thought I was good enough to script this in bash quickly. My life as an IT guy requires me to have some knowledge of scripting and I thought I was good enough to pound this out quickly. It took some time. In the end, I realized that there was a simple and elegant way to get the correct result. If you want to just skip to it, here it is: https://github.com/leonowski/euler-practice/blob/master/euler3.sh

My first approach was really like a brute force method. I made a function to check if something was a prime number and made another function to get the square root of a number (I checked if numbers 2 through the square root were evenly divisible). I then made an array of all numbers evenly divisible into the input number. It looked like this:

#lets make an array to get divisibles first
array=()
for x in $(eval echo "{1..$1}")
do
  if [[ $1%$x -eq 0 ]];
    then
    array+=("$x")
  fi
done

After I got that array, I would then pull out only the primes into a new array. I would use the function I made previously to verify if the number was prime. It looked like this:

#now lets filter out the divisible array and keep only primes
primearray=()
for z in "${array[@]}";
  do
    if [[ $(primecheck "$z") -eq 1 ]];
    then
      primearray+=("$z")
    fi
  done

Finally, I would just print out the last number in the new prime only array like this:

printf '%s\n' "${primearray[-1]}"

And that did work. But it was freakin slow. On top of that, it wouldn't take the input of the number that the Euler problem wanted me to use:600851475143.

I scratched my head during my free time. My next idea was to abandon the use of arrays. Why not just loop backwards from the input to 1 and stop when we find even divisibility AND have a prime number?! So, I dropped the whole array approach and just made 1 for loop like this:

for x in $(seq "$1" -1 1)
 do
   if [[ $1%$x -eq 0 && $(primecheck $x) -eq 1 ]];
   then
   printf '%s\n' "$x"
   break 2
  fi
done

Technically, this was better. I didn't have to go through a set of numbers twice. I was able to calculate prime factors a little faster, but it still wasn't that fast and I still couldn't use the actual number the problem wanted me to use. AARGH!

My next thought was to optimize the loop by immediately skipping the primecheck() function on numbers that were obviously not prime - numbers ending in 0,2,4,6, and 8. So, I added this check before the mod 0 and primecheck:

for x in $(seq "$1" -1 1)
 do
   lastdigit="${x: -1}"
   if [[ "$lastdigit" -eq "0" ]] || [[ "$lastdigit" -eq "2" ]] || [[ "$lastdigit" -eq "4" ]] || [[ "$lastdigit" -eq "6" ]] || [[ "$lastdigit" -eq "8" ]]; then
   continue

   elif [[ $1%$x -eq 0 && $(primecheck $x) -eq 1 ]]; then
   printf '%s\n' "$x"
   break 
  fi
done

This also gave a speedup again. But - you guessed it, still no luck with:600851475143.

I battled some flu symptoms while thinking about this in my head. I then thought about trying to figure out prime factors by hand. I had actually forgotten how to do this by hand - it's actually really simple. Starting at 2, see if you can evenly divide the input number by 2. If you can, great - write it down. Try 2 again against the answer of the previous try (the input divided by 2). If it still works, write it down again. Try 2 again against the answer of the previous try (the input divided by 2 divided by 2) - if it doesn't work, try 3 against the answer of the previous try. If 3 works, write that down. If it doesn't, continue to 4. Continue this until all the numbers you have written down multiplied by each other equal your input number.

Some examples:

Input: 12345
Factors: 3 x 5 x 823

Input: 123456
Factors: 2 x 2 x 2 x 2 x 2 x 2 x 3 x 643

Input: 22
Factors: 2 x 11

Input: 123
Factors: 3 x 41

A pattern began to show. The last number is always a prime! While in my medicated/woozy state, I wrote down my thought on a new loop (using arrays again) to get what I needed. It turns out being away from my computer really helped. Doing things by hand can really make things simple for the brain.

notes

So, the script simply looked like this (no more prime checks):

factors=()
current=$1
for x in $(seq 2 1 "$1")
do
  if [[ $current%$x -eq 0 ]]; then
  factors+=("$x")
  current="$current"/"$x"
  x=$x
  fi
done

printf '%s\n' "${factors[-1]}"

This seemed to work really well! I was like "holy crap!" - but there was still a problem. I couldn't use the input number that the Euler project wanted me to use:600851475143. I was REALLY starting to hate that number. In fact large 6+ digit numbers all seemed slow.

After running through this in my head, I realized that the entire loop had to be run from 2 through the input number. So, the bigger the input number, the longer it would take. It shouldn't need to keep going once you have all the factors in place to equal the input number. So, I added something in the loop to keep track of it:

mul=1
  for i in "${factors[@]}"; do
    mul=$(( mul * i ))
    done
#    echo $mul #uncomment to see progress of check done before break
    if [[ $mul -eq $1 ]]; then
    break
  fi

What this did was print out the current state of the array if you multiplied all the values. Once it got to the input number, the loop could stop. Done - you have your prime at the end of the array. This improvement also worked well. I even had something in there to print the progress of the array multiplication. Things were now SUPER FAST! But - GUESS WHAT AGAIN!? I still couldn't use the 600851475143 number. I was almost ready to scream.

I tried to run it again with 600851475143 and realized that my "echo $mul" wasn't even showing any progress. Surely, it would have hit 1 number at some point that was evenly divisible. I suspected that the number 600851475143 was just way too big for what I was running this on and the for loop would never run on that big number (I was running this in Cygwin64 on Windows). Maybe this isn't a problem in a real linux environment? I then started experimenting with the main loop. I realized that if I changed the upper range of the for loop to a hard number (instead of the variable input like 600851475143), I could get my answer for 600851475143! The answer was 6857, so if I put 7000 as the last number in the loop, things worked! So, finally, I thought about doing a loop that didn't depend on a definite number range. The answer was to simply use a while loop! I could initialize the counter at 2 and count up indefinitely without specifying the upper number! So, the final work looked like this:

factors=()
current=$1
x=2
while [[ $x -gt 1 ]]
do
  if [[ $current%$x -eq 0 ]]; then
  factors+=("$x")
  current="$current"/"$x"
  x=$x
  mul=1
  for i in "${factors[@]}"; do
    mul=$(( mul * i ))
    done
#    echo $mul #uncomment to see progress of check done for break
    if [[ $mul -eq $1 ]]; then
    break
    fi
  fi
x=$((x+1))
done


#largest prime factor should be last element in the factors array
printf '%s\n' "${factors[-1]}"

YES! That worked! No more hard coding a for loop range! I'm sure this problem would be easy for a seasoned programmer, but it was actually fun for me to go through this and struggle.

UPDATE

OK, while the above did work for the project euler number, the script seemed to be broken for some other numbers. I took a closer look and I was incrementing the loop at the wrong time. I also made the break just print it out right there. Here is the final best version (I promise!):

factors=()
current=$1
x=2
while [[ $x -gt 1 ]]
do
  if [[ $current%$x -eq 0 ]]; then
  factors+=("$x")
  current="$current"/"$x"
  x=$x
  mul=1
  for i in "${factors[@]}"; do
    mul=$(( mul * i ))
  done
#    echo $mul #uncomment to see progress of check done for break
    if [[ $mul -eq $1 ]]; then
    printf '%s\n' "${factors[-1]}"
    exit
    fi
  continue
  fi
x=$((x+1))
done