 |  |  |  | | | Nice. BTW, the algorithm is called 'sieve of eratosthenes' :) |
|  | | an obvious optimization..
for(i=2;i<MAXX;i++)
can be replaced with
for(i=2;i<sqrt(MAXX);i++)
but to save a lot of memory(at minimal speed cost)
int primes=0;
for(i=2;i<=sqrt(MAXX);i++)
{
nos[primes]=i;
for(j=0;j<primes;j++)
if((i % nos[j])==0) break;
if(j==primes)primes++;
}
this uses the same algorithm(but in reverse): but only the primes are stored: thus drasticly reducing the memory required, and also allows you to print out the primes as you go(don't need to wait until you're doen computing) |
|  | | | try to save the flags in bits, you will get larger storage :D |
|  | | The algorithm given by anonymous although efficient does not solve my purpose which is to find *all* primes below a certain limit(In this case its only below the sqrt of the MAXX). If I wanted to just test the primality of a number this method is too inefficient. I don't really think this method resembles the 'sieve' algorithm. Plus if you want you can get the prime output as soon as you compute it. Once you have a list of primes by this method you can use it for primality tests of huge numbers.
for(i=2;i<MAXX;i++)
{
if(nos[i]==0) //If the number is not prime i.e crossed out
continue;
//insert printf here because we know number is prime
printf("%d\r\n",i)
for(j=2*i;j<MAXX;j+=i)
nos[j]=0; //If it is prime cut from list all its multiples
}
I initially did think about using bits instead of ints but that would require some bitwise operations which although efficient still consume CPU. But atleast I can use char.
--ne0 |
|  | | | Hmmm, I strongly doubt that the sieve of eratosthenes is close to the fastest method to compute primes. I have red loads of articles about and that's not the direction they take. The sieve of eratosthenes in fact is the first method that usually is taught in schools and it's the easiest one. In fact it's probably the first method ever used to compute primes. |
|  | | |  | | <- the first anon post:
actually ronald: that WOULD solve for them all up to MAXX(if you made the change in my), because any integers larger then sqrt(MAXX) and below MAXX which are not divisible by a number between 1 and sqrt(MAXX) must be prime
but you are correct about the code I made for a memory optimization: that it doesn't loop far enough: it should actually be...
for(i=2;i<MAXX;i++)
{
nos[primes]=i;
for(j=0;(j<primes) && (nos[j]<= sqrt(i));j++)
if((i % nos[j])==0) break;
if(j==primes) || (nos[j]> sqrt(i))
primes++;
}
|
|  | | | the Sieve of eratosthenes is by far not the fastest and its very proc intensive, thats why it used (my still) to be used in benchmark software its far from efficient |
|  | | | A memory optimize would consider using a boolean array and use the index numbers as the prime checks, that way you don't have huge overhead |
|  | | I would use a double linked list with the nodes kept in an array for fast acces through the number. 1 todo list (double list for erasing when accesed through the array) that contains all numbers that might still be prime, 1 primes list that contains all found primes and 1 erased list that contains the numbers we erased from the todo list, it might be best to start with a dummy node containing the value 1.
while todo is not empty:
-the first item on the 'todo' list is a prime.
-erase that item from the todo list.
-put in the primes list (using 1 of the 2 links)
-traverse through the erased list, for every item:
--multiply it's value with the prime
--if it is greater than MAXX, erase the current node being traversed from the erased list because the next prime will only be bigger. continue traversing.
--else acces that item through the array and erase it from the todo list.
--add it to the back of the erased list.
(using a dummy node 1 the prime itself is also added to the list)
the original algo tries to take the same item a lot of times of the list of possible primes. This algo does it only 1 time. |
|  | | |  | | the problem with that method janus: is it requires even MORE memory, and that is also not O(N), since you have a loop of loops. and it also seems that the code would detect that 4 is prime?
eg.
loop 1:
2 is prime: add it to erase list
loop2:
3is prime: add it to list, and multiply 2*3 to remove 6 from the todo list
loop3:
4 is prime? then remove 3*4 and 6*4 from the list?...
it would seem to detect 4,8,9,10,15, and many other non primes? |
|  | | loop 1:
2 is prime.
start traversing the erase list (contains only '1')
1*2 = 2 => add 2 to the list
next item is 2 2*2 = 4 => add 4 to the list
.....
And you are right about using extra memory: 2 links and a value field per number, so about 3 times as much.
and loops of loops can still be O(N): each number is only calculated once and taken off the todo list. And the moment a number on the erased list is multiplied by a prime with a result greater or equal to N, it is also taken of that list.
The first gives that a maximum of N valid calculations (result lower than N) take place, The second that the maximum of invalid calculations is again N. The inner loops is traversed under 2*N times. |
|  | | removing from the list isn't o(1) though, it's o(n), it's only o(1) if you only ever remove the first index(or any other fixed/maximal offset)
I still don't understand the algorithm though..
loop1:
2is prime, and 1,2,4 are on the list, while 4 is removed from 'todo'?
then loop2: 3 is prime, and 1,2,3,4,6,9,12 are on the list while 6,9,12 is removed from todo?
how is 8 ever removed?
and what is this magicl "N" that you are using as the maximum? the only "N" that made sense to me, was the maximum number of index's, which would still leave your lists being of length N: thus the total is O(N^2) not O(n)?
or is it recursive: that rather then 1,2,4 being removed on first pass: that 1,2,4,8,16,32... is removed?
then(because of the cost of traversing linked lists) it would be O(n^2 *log(n)) rather then even O(n^2)
the solution I posted, doesn't have the extra overhead of a linked list, is O(n root(n)), and also like yours doesn't require calculating the same values multiple times.
clearly however: the true value/importance in the speed of calculating primes, needs to also take into account the magnitude of the numbers being used. eg. when working with 500digit numbers, you can't just store it's value in an 'int' and use the division/multiplication opperators.. and multiplying 2 500digit numbers in itself becomes O(N log(n)) rather then the often assumed O(1). and at the higher range of numbers, it becomes clearly non-feasible to use any of these methods(aka, factoring) and instead lean on statistical probability.
eg. for teh number 1000000001
if we examine the series 11,101,1001... we can see a pattern emerging.
11 = prime
101=prime
1001 = 7*11*13
10001=prime
100001=prime
1000001=101*9901
and with almost no math(and obviously better then O(n^2), we can predict the number is probably not prime.. more so, with some more math we can predict what some of the divsors OF this number are likely to be, and can devise a mathematical simple function to represent this series of numbers.
ie. 10^x +1 is probably not prime if X is a multiple of 3: and is probably prime if X is not a multiple of 3.
in addition to stastics/probability and patterns like this..
for brute forcing large numbers, much of the math can be simplified.
eg. for the number 1526600120149
rather then looping up from 1 to 1235556
what we can do is look at 1235555*1235557
ie. 1526600120149 = 1235555*1235557 with a remainder of 1491014(which is clearly not a multiple of 1235555)
now with that math done: we can VERY quickly calculate the remainder of 1526600120149 vs 1235553*1235559 +x
1235555*1235557 +1491014 = (1235555-2)*(1235557+2) + x
a little simplification and we get: 123555*123557 + 2*(1235555-1235557) -2*2 +x
so now we can ignore the 123555*123557 term on both sides, and we get:
1491014 = 2*(-2) - 2*2 +x
so x=1491022
thus 1526600120149=1235553 * 1235559 with a remainder of 1491022(obviously not a multiple of 1235553)
the same math again will show
1526600120149=1491014
1235551 * 1235561 + 1491046
1235549 * 1235563 + 1491074
1235547 * 1235565 + 1491110
...x*x+x*4 = 980096-2x
(-6 +-sqrt(36+4*980096))/2
rather then checking each of these: we can approximate how far we would need to jump, to get to the point where the remainder is closest to 2* our divisor
eg.(1235553-x) * (1235557+x) + aprox. 1235553*2
because if it were to be exact: it would mean that
(1235553-x) * (1235557+x+2) gives a remainder of 0: and thus is a factor
and we get that X is aprox 987.00252524930460401494794895445
so lets guess x=986 to start(no point in using 987 and factoring even numbers), and then do some more math
so:
(1235555-986) * (1235557+986+2) gives a remainder of -3964,
since our remainder is negative: we know that we pastthe delta of 2 so we need to check the previous couple numbers to make sure we didn't miss a factor
so we re-calculate our formula and numbers to get
(1234565) * (1236549) gives a remainder of -3964
and (1234565-x) * (1236549+x) gives -3964 +x*x+(1234565-1236549)*x
= -3964 + xx -1984x
so lets check x=-2 and x=-4
-2 gives -3964 + (-2)*(-2) - 1984*(-2) = 0..
so this means that
1526600120149 = 1234567*1236547 with no remainder.
and programatically: we arrived at this with 2sqrt's, and a couple multiplys, divides and adds, but in that process, we confirmed that 1526600120149 is not a multiple of any integer between 1234567 and 1236547(989 different odd numbers), thus making it approximately 500 times faster then the above methods for numbers with 12digits, but for smaller numbers, this method would not perform significantly faster |
|  | | 1. removing is O(1) because you use the array to access the nodes and the double linked todo list allows you to remove a node without knowing where in the list it fits, you just need a pointer.
2. The algo gives all primes under N. The todo list starts with a little under N items and shrinks. The erased list grows and shrinks dynamicly.
3. in the first pass, every number under N that only has prime factors of 2 is removed from the todo list. We find 1, add 2*1, find 2, add 2*2, find 4, add 2*4, find 8, ...
in the second pass, every number under N that only has prime factors 2 and 3 is removed. All those that only have factor 2 are already on the erased list, so we multiply these, together with 1, with 3.
4. each number is calculated 1 or 0 (in the case of a prime) times => less than N calculations resulting in a number being erased from the todo list and put in the erased list and less than N numbers ever get on the erased list. If the multiplication of a number on the erased list with the current prime has a result greater or equal to N, we take the number off the erased list because the next prime will only be bigger. Each number can only be erased 1 time from this list so less than N calculations resulting in a number being erased from the erased list. Combined, the total number of calculations is less than 2*N, O(N).
|
|  | | so you start with an array of doubly linked list: which contains all the numbers 2 - N, and another array doubly linked list which contains only 1?..
eg.
struct{
int value;
linked *prev;
linked *next
}linked;
linked todo[1000];
linked *erased,*erasedTail;
for(a=0 to 1000)
{
todo[a].value=a;
todo[a].prev=&(todo[a-1]);
todo[a].next=&(todo[a+1]);
}
todo[0].next=&(todo[2]);
todo[2].prev=&(todo[0]);
erased=&(todo[1]);
erasedTail=&(todo[1]);
erased->next=erased->prev = null;
//then
while(todo[0].next)
{
int remove=todo[0].next->value
printf("%d, ",remove);
linked *eraser=&erased;
while(eraser)
{
int nextvalue=remove*eraser->value;
if(nextvalue<1000)
{//starting with the linked list sorted in memory is the major part I was overlooking
todo[nextvalue].next->prev=todo[nextvalue].prev;
todo[nextvalue].prev->next=todo[nextvalue].next;
erasedTail->next=&(todo[nextvalue]);
erasedTail=erasedTail->next;
erasedTail->next=null;
}else{//this else is the other part I overlooked
eraser->prev->next=eraser->next;
}
eraser=eraser->next;
}
//
}
so ya: it's O(n), but is defiantly not memory(or cache) friendly :)
I love tricks like memory sorted lists though hehe |
|  | | I think you have the idea :p. Just a few points:
todo[N-1].next=0;
with N=1000 in your case
if(todo[nextvalue].next!=0) todo[nextvalue].next->prev=todo[nextvalue].prev;[\code]
makes sure we don't try getting members from a null pointer.
I also included a linked* previous so that we can erase stuff from the erased list (it's one link only, so we cannot use eraser->prev->next=eraser->next;)
rules are:
nextvalue<1000 : add previous = eraser; at the end of the block
else case becomes:
[code]previous->next=eraser->next; //one link used
if(eraser==erasedTail) erasedTail=previous; //danger: bad case
the second line makes sure we don't lose our tail
|
|  | | This is yet another (and quite optimized) version of the Eratosthenes' sieve:
#include <stdio.h>
#include <memory.h>
// we're checking numbers 0..2*upto; 'uptoSqrt2' is sqrt(upto/2)
enum { upto=18000000, uptoSqrt2=3000 };
// 'is' is a bit-array; bit with value 1 means the number is prime
// the array contains only info for odd numbers
// example: is[0]==(MSB)01101110(LSB) means 1 (LSB) isn't prime; 3,5,7 is, 9 isn't, 11,13 is, 15 isn't
unsigned char is[upto/8+2];
void generate() {
unsigned i,j;
memset(is,0xff,sizeof(is));
is[0]=254; // 1 is not prime => first bit is set to 0
for (i=1; i<uptoSqrt2; ++i) { // checks whether 2*i+1 is prime
if (is[i>>3]&(1<<(i&7))) { // if it is => zero all its multiples
for (j=2*i*(i+1); j<upto; j+=2*i+1) is[j>>3]&=~(1<<(j&7));
}
}
}
int isPrime(unsigned a) { // works for a<=2*upto
if (a<=2 || !(a&1)) return a==2;
return !!(is[a>>4] & (1 << ((a >> 1) & 7)));
} |
|  | | To clearly understand the Sieve of Eratosthenes please do this:
Write numbers from 1 to 100 inclusive on a paper, try 10 numbers at a line.
Cross out 1, don't cross out its multiples since it would be every number.
Now cross out every non crossed out numbers one by one with crossing out its multiples upto the end of the list.
When you've come to 10, you will see that all remnants are primes 10 <= sqrt(100) but why sqrt(MAXX)?
You can divide MAXX to sqrt(MAXX) and you will get sqrt(MAXX). If you divide MAXX to any number greater than sqrt(MAXX) then you will get a number less than sqrt(MAXX), which you've already processed while coming to sqrt(MAXX). Thus, checking numbers upto sqrt(MAXX) is enough to complete the sieve of eratosthenes. If you do it from 1 to 100 you(anyone) will clearly understand it.
Upto 16 e.g.:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
X X 3 X 5 X 7 X 9 X 11 X 13 X 15 X
p: 2
X X X X 5 X 7 X X X 11 X 13 X X X
p: 2 3
since 5 > sqrt(16) = 4 all remnants are prime. Easily verifiable for MAXX = 16.
I was going to write that I think the fastest method to compute primes is GNFS google and wikipedia on it ;) but I've checked and it does not list but factor big integers. |
|  | | If you want to generate a list of primes, you can speed up your method by only checking numbers that are good candidates for being prime.
You know that even numbers after 2 are not prime. Writing out a list of numbers with 2 numbers per line illustrates this clearly. Every number below 2 is not prime. You can see similar eliminations writing out lists 6 numbers per line, or 30 numbers per line, etc.. You can write an algorithm to limit the numbers you check for primality by creating a base consisting of small primes (e.g. 2*3*5=30), called primorial numbers. Then, a pattern array consisting of primes in the range of the base that exclude those primes used for the base (for base=30, [1,7,11,13,17,19,23,29]).
Now you only need check numbers that are 30x+[1,7,11,13,17,19,23,29]. You would only be checking 8 out of every 30 numbers (for this base), an improvement over checking every number!
You can increase this base (2, 6, 30, 210, 2310, 30030, 510510, etc..), but the pattern size increases much more than the payoff. Each increase in the base makes it only slightly more efficient. So, a base of 210 or 2310 is probably sufficient.
|
|  | | I forgot to mention, the pattern should include "relatively" prime numbers to those in the base, not only primes.
You only see this for base primorials after 30. So, for 210, you need to include numbers like 143, which isn't prime, but is relatively prime to 2,3,5,7.
So the pattern should include all numbers between 1 and the base that are not divisible by the primes used to make the base number. |
|  | | |  |  |  |  | Anonymously add a comment: (or register here) (registration is really fast and we send you no spam)
|