The loop is what I'm focusing on optimizing right now; it is definitely the weak link. Recent OpenGL versions have plenty of mechanisms for moving vector data in and out of the card efficiently. Allow me to break it down:
You can send one point at a time
You can send an array of points at a time (vertex arrays).
You can send the card a system memory location where the vertexes are stored and let the graphics card read from main memory.
You can compile vertex arrays and store them in card memory (DDR3 these days) as vertex buffer objects, which the card can then create instances of. This part rocks; I don't use it yet, but I will. Graphics cards now even have high-level languages for doing general operations on the vertex data on the card itself.
This is the fast step. The vector calculations are performed on the card and returned in a feedback buffer object.
Here is the slow step.
I must then step through the feedback buffer, taking the x, y, r, g, b for each point and inserting them into a short[]. After stepping through the entire returned buffer, I copy that short[] into a buffer that the sound API is continuously streaming from. As geometry complexity increases, this parsing bogs down and gives the software less of an interactive feel. The first optimization I plan to make is to parse the buffer in smaller chunks instead of all at once, to reduce the time spent waiting for the loop to finish.
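To make the chunking idea concrete, here is a rough sketch of what I have in mind. It is not the actual code. It assumes the feedback data has already been read back into a plain float[] with five interleaved values per point (x, y, r, g, b), and that every value is scaled to a signed 16-bit sample with out-of-range values clamped. The names FeedbackChunker, convertChunk and toSample are made up for illustration, and the real buffer layout and scaling may well differ.

```java
// Hypothetical helper: converts feedback data to 16-bit samples a few points at a time.
class FeedbackChunker {
    // Assumed layout: x, y, r, g, b stored back to back for each point.
    static final int FLOATS_PER_POINT = 5;

    // Scales a value to a signed 16-bit sample, clamping input to [-1, 1] first.
    static short toSample(float v) {
        float clamped = Math.max(-1.0f, Math.min(1.0f, v));
        return (short) Math.round(clamped * Short.MAX_VALUE);
    }

    // Converts at most maxPoints points, starting at point index start.
    // Writes into the same positions of out and returns how many points were converted.
    static int convertChunk(float[] feedback, int start, int maxPoints, short[] out) {
        int totalPoints = feedback.length / FLOATS_PER_POINT;
        int end = Math.min(totalPoints, start + maxPoints);
        for (int p = start; p < end; p++) {
            int base = p * FLOATS_PER_POINT;
            for (int c = 0; c < FLOATS_PER_POINT; c++) {
                out[base + c] = toSample(feedback[base + c]);
            }
        }
        return Math.max(0, end - start);
    }
}
```

The caller would keep its own position counter, call convertChunk with a small maxPoints, advance the counter by the returned count, and handle input or hand finished samples to the sound buffer between calls. A return value of 0 means the whole buffer has been converted.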
In the end, it could be that the transactional and GL overhead is so high that I would be better off doing all the calculations in software...
I'll post things the way they are and then write an outline for the flow of the program. Eventually I'll get around to adding some comments on the core.