Monday, August 16, 2010

Using the sieve to factor numbers

Previously, we saw an overly simple implementation of the Sieve of Eratosthenes. Without any optimizations, that code is pretty much only good for puzzles and learning. Even for puzzles, it may be too slow. Some of the puzzle problems out there require code to find the prime factorization of numbers. For small numbers, trial division might be ok. Often, the problems require the factorization for every number less than one million. Then... it gets slow. Using the same idea as the Sieve, we can prepare an array that contains a prime factor for every number, or one if the number is prime. Then, to get the full factorization of the number, we can walk the array. Here is the code to create the array:
```def divisors(N):
"""return the biggest prime divisor for numbers less than N"""
result = [1 for x in xrange(N)]
for d in xrange(2,N):
if result[d] > 1:  continue
for j in xrange(d, N, d):
result[j] = d
return result
```
Then, to find the factorization of a number, we can look it up in the array to find a factor, then look up (n/p) in the array to find the next factor, and keep going until we are down to a prime number.
```divs = divisors(2000)

def factor(n, divisors=divs):
result = []
while n > 1:
p = divisors[n]
if p == 1:
result.append(n)
return result
while n % p == 0:
result.append(p)
n /= p
return result

print 1000, factor(1000)
print 24, factor(24)
print 17*31, factor(17*31)

```
Often, the factorization is used for things like computer Euler's totient function, and then we don't care to return the factors, just the result of using them. I used ideone.com to work on this code: Ideone.com | RLRZt