
The most ignored programming story of 2007

Justin James reveals what he considers the "sleeper" programming story of 2007.

In my previous meta post, I promised to start the year with an article about the "sleeper" story of 2007 and describe why I think it's slipping under a lot of radars. The story is, of course, about multithreading and parallelism.

To the vast majority of programmers, this topic has never been interesting, and it never made sense for them to bother with it -- after all, with only one CPU core available, multithreading is done through timeslicing at the OS level, which is extremely inefficient. Using more than one thread per process (or process per application) only makes sense in the most trivial of cases (one thread to act as a worker and another to monitor the "cancel" button) or for systems that spawn children to service network requests, like a Web server or a database server. Most programmers can handle the "cancel" button variety with a bit of effort, and only a select few work on those network server projects.

In the last few years, the number of CPU cores (first logical, with hyperthreading, and then physical as well) suddenly climbed past one on mainstream equipment, thanks to the plummeting price of multiple-socket server motherboards and to dual (and now quad) core CPUs hitting both the desktop and the server room at bargain-bin prices. Sadly, this has been a response to the fact that current CPU materials will set themselves on fire if clock speeds are pushed much past their current levels. Packing more CPU cores into a physical chip keeps Moore's Law on track without increasing the clock speed. I predicted that multithreading would become very important, since the speed of single-threaded execution is no longer climbing nearly as fast as it did in the past.

As it turns out, very few programmers seem to care about multithreading. Those who have thought about the topic really do not see the need to use it as a technique in their projects, which are typically written in Java, VB.NET, C#, or PHP. Most programmers are writing applications that perform "data processing" as opposed to applications that perform "computations." The difference is quite important.

Data processing is about dealing with data as an aggregate set (maybe drilling down to a handful of records out of many) and performing calculations that are trivial at the row level but become quite a task at the data set level. Data integrity and accuracy are more important than speed, although speed matters as well. You can usually spot a data processing application because it does not do anything that a Microsoft Access or Microsoft Excel application could not do -- just at a much larger scale. Data processing apps tend to be I/O bound, not CPU or RAM bound.

Computational applications are typically centered around a single, relatively free-form chunk of data, such as an image, movie, sound bite, or even a text document. When they are working hard, the CPU and RAM systems are stressed to the max, but the I/O system might not have a single byte going through it until everything else is said and done. Speed is usually more important than robustness; no one expects to be able to recover if the power goes out in the middle of a lengthy processing period.

Since most of us are working on the data processing-style applications, the multithreading that needs to occur happens either in the parent process (like the Web server that spawned the process that our code runs in) or in one of our child processes (such as the thread in the database server that is processing our request). Most of the work that the code we write does is input validation and output formatting. We rarely even worry about record locking at the application level; as long as we lock the rows in the database, we feel that it is highly unlikely that two people would modify the same record at the same time. We don't even bother to find out what our application server does about locking the Session system so two calls to the same page with different parameters do not cause chaos.

Most programmers don't care (and don't need to care) about multithreading and parallelism because 99% of the parts of the application that need to work like that are written by either the application server vendor or the database vendor. Our work is not computationally intense enough to justify either the CPU overhead or the additional work.

Is there something wrong with this? Not really, as long as programmers are getting a paycheck and enjoying their work. However, it speaks volumes about what tasks programmers get paid to do, and the companies that pay them to do those tasks. I wonder why companies still cannot see past data processing as a task for computers to handle.

J.Ja

About

Justin James is the Lead Architect for Conigent.

38 comments
oloo311

what took you so long?

Guitarzan8

Blast. I thought this thread was going in the direction of tips and hints on how people multi-thread their apps. I read through the first dozen and gave up. Sounds like JJ has control over that in his VB environ. I'm a db developer for a 2,300-user data warehouse: Websphere Portlets showing SQL Server 2005 table data. Sure would like to speed up my porklets.

Justin James

Sorry that we're not on the stuff that can help you out. If you are a DB developer, there isn't much you can do; SQL Server automatically does multithreading based on its needs. In terms of the "porklets" (I like that phrase, I may reuse it down the road), you first need to find out where they are getting held up. Multithreading is not a cure-all. In fact, the app server is already multithreading by running multiple page requests at a time. But it would be a good trick to get it to run multiple portlet requests at once, particularly if the DB access is a bit slow, so at least they will generate in parallel, not sequentially. Sorry I can't be of much more use; as you say, I am in the .Net world, which is a different story than Java. :( J.Ja

nwalton

The problem in handling multi-core systems lies not with the OS but with the language: C/C++/Java are not designed to handle automated threading of evaluation. To achieve that, you need to drop the concept of store, in which case a thread merely becomes an evaluator of an expression without leaving behind any side effects in memory. Then it doesn't matter where a thread is executed: multiple threads can be allocated across multiple machines or multiple cores. Blue-sky thinking, you might say; not so, say I -- have a look at the work on Termite using Gambit Scheme (http://p-cos.net/lisp-ecoop05/pdf/19654.pdf) or at Erlang.

Justin James

Why do C/C++ lack good threading models? Because C/C++ are historically very tightly linked to the *Nix platform. Indeed, C was designed so the original UNIX could be written. UNIX was not designed to be threaded; it was designed strictly to have these things done in processes. That's why C/C++'s threading model is so kludged together. Java probably suffers (I'm not too familiar with it) because of the need to be cross-platform, if I had to guess, as well as its historical connections to C/C++. Erlang is designed from the ground up for this type of thing (it is used in phone switches); so were Crays (and other supercomputers) and mainframes. This stuff is really just a problem on microcomputers. :) J.Ja

mdhealy

A key distinction to make when thinking about how to use multiple CPUs is between "high performance" computing versus "high throughput" computing. For getting two people from Point A to Point B, a sports car -- a high-performance vehicle -- is the fastest method, but for transporting a group of 25 people to the same destination a bus -- a high-throughput vehicle -- might be preferable. In my own work -- genomics data mining -- I often do large-scale analyses that I run overnight. For these I don't need to bother with fine-grained parallelism via threading, I can just run 16 or 32 or whatever batch processes each of which does a chunk of the whole job.

Justin James

What you're talking about can be done with threads as well, but you are right; it is much easier to just split the *input* into an appropriate number of chunks and start a "separate but equal" process on each chunk. Indeed, in the "data processing" world, that is a very easy algorithm, since the pieces of the puzzle have less to do with each other than in other types of computing. J.Ja
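To make that concrete, here is a minimal sketch of the process-per-chunk pattern, assuming a POSIX system with fork()/wait(); process_chunk() is a hypothetical stand-in for whatever the real per-chunk work is:

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define NUM_WORKERS 4          /* e.g., one per core or per batch slot */

    static void process_chunk(int chunk)
    {
        /* Placeholder: a real worker would process its slice of the input. */
        printf("worker (pid %d) processing chunk %d\n", (int)getpid(), chunk);
    }

    int main(void)
    {
        for (int i = 0; i < NUM_WORKERS; i++) {
            pid_t pid = fork();
            if (pid == 0) {
                process_chunk(i);  /* child: do one chunk... */
                _exit(0);          /* ...then exit */
            } else if (pid < 0) {
                perror("fork");
                return 1;
            }
            /* parent falls through and spawns the next worker */
        }
        while (wait(NULL) > 0)     /* reap all workers */
            ;
        return 0;
    }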

Jaqui

If you took that same data and wanted to generate a 3D mesh, appropriately textured for a realistic look, and render a "fly by" animation from it, the threaded/multicore process model would be a much better option. Taking that model away from the extremely CPU-intensive fields and moving it into the normal data processing area is what needs to happen before every business will even begin to see the benefits of it and want to have it implemented.

Vladas Saulis

Thank you for very interesting articles on multithreading and parallel processing. You always choose very good and up-to-date topics, and I like your attempts to understand the very essentials of problems. But I'd like to argue with you and perhaps criticise some modern views on threading. I may seem annoying in these attempts, so excuse me for that.

In my view, multithreading is not a good technology for multi-core and parallel programming. It is very inflexible and excessive, and not nearly as robust as it should be! It is a kind of doing things through the back end, and the need for synchronization and excessive locking is only one side of the coin.

Let's take your example of movie compression. If you split a stream into two parts, it can use only 2 CPUs, no matter how many CPUs are in the system. Ok, split it into N parts; then you can probably take advantage of N CPUs. But what if I add one or two additional CPUs to the system? Or if some CPUs are busy with, or affined to, other tasks? Threading parallelization is done at DESIGN time, not at *runtime*. It is a hard-wired technology, which is quite ok nowadays, but not in the [near] future. You cannot write a multithreaded application (except for the classic listen-accept server app) that would dynamically use all of the free or not-very-busy CPUs in the system!

So I daresay that the classic multithreading model is good only for one-CPU (!) systems. It's normal for mail clients to fetch e-mail while showing an e-mail body, and even to do HTML rendering in different threads -- and this is all that multithreading is really for. It can be used to some extent in multi-core programming, but in the future there must be a quite opposite concept for parallelizing. You've mentioned the data-flow programming model here, and there is an object-flow model (as I call it) as well.

Please excuse me for such a (seemingly) categorical reply, and for my not very precise English too.

Justin James

You are right to the extent that MT structure is fixed at compile time. What you miss is that you can have a semi-flexible algorithm to determine how many threads can be run at once. Furthermore, if you run "too many" threads but have the priority set low enough, the OS will be happy to postpone the extra ones until resources are free (within reason, of course... the stack space alone for 10,000 threads will kill you). I've done a ton of research in the last few months since we last touched base. Really, what you want is either a Cray system or a mainframe (or minicomputer) like the AS/400, VMS, etc. They do the stuff you are talking about. It's just a model that isn't familiar to most programmers, so it sounds new. :) J.Ja
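As a hedged illustration of that semi-flexible sizing, here is a minimal sketch that picks the thread count at runtime instead of hardcoding it; it assumes a POSIX system where sysconf(_SC_NPROCESSORS_ONLN) is available (Linux, Solaris, and friends), and the cap of 64 is an arbitrary illustrative limit, not a recommendation:

    #include <stdio.h>
    #include <unistd.h>

    /* Choose a worker-thread count from the CPUs currently online. */
    int choose_thread_count(void)
    {
        long cpus = sysconf(_SC_NPROCESSORS_ONLN);
        if (cpus < 1)
            cpus = 1;              /* fall back to a single thread */
        if (cpus > 64)
            cpus = 64;             /* arbitrary cap: don't drown in stack space */
        return (int)cpus;
    }

    int main(void)
    {
        printf("would start %d worker threads\n", choose_thread_count());
        return 0;
    }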

Vladas Saulis

I can agree that in some simple cases MT is good enough -- when the threading logically divides well into separate tasks. But when you deal with data-streaming applications, this division is not very well determined. In your example of video compression split into two or more parts, there exists a possibility that the total time of all the parts' calculation would be even longer than without MT. Let's see: your video compression app is smart enough to see 2 CPUs in the system, so it creates two threads, splitting the stream into two parts. Then it launches these two threads, each on a separate CPU. But let's assume that one CPU is so busy with another task that it runs its compression thread 10 times slower than the other one, running on the other CPU. Thus, the total time of compression will be around 10 times longer than the same job without threads! What I'd like to underline is that there should be some smart adapting to the current system (and CPU) load here, and that is hardly achieved with the threading model. Even if you make a lot of threads, this won't solve the issue, because at design time you can never predict the real situation in the system at runtime. BTW, I have worked on systems like the AS/400 and VMS, and there is nothing different in this aspect. However, I don't know much about Cray. What is different about Cray in this respect?

Justin James

MT is actually a bit different from fork(). The main difference is that threads share state, while fork()ed processes each get their own copy of it. That's why, many times, you see code like:

    pid = fork();
    if (pid == 0) {
        /* I am the child process */
    } else {
        /* I am the parent process; pid holds the child's PID */
    }

This model cuts both ways. The good thing is, it is really simple to program against. The bad thing is, it is extremely limited. You can do raw, pure parallelization work like this, but asynchronous processing is a bit different. So then you get IPC calls like signalling, mutexes, and so on. Poof, now we're in threading-land, using processes instead of threads. Indeed, it is a bit of work to get threads to work like fork(), which is one of the things I am looking forward to seeing in the Parallel Extensions library coming soon for .Net. And no, you are not intruding at all! J.Ja
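For comparison, a minimal sketch of the threaded equivalent using POSIX threads (compile with -lpthread); worker() is a stand-in for real work, and note that the result comes back through shared memory, with no IPC needed:

    #include <pthread.h>
    #include <stdio.h>

    static void *worker(void *arg)
    {
        int *result = (int *)arg;
        *result = 42;              /* stand-in for real work */
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;
        int result = 0;

        if (pthread_create(&tid, NULL, worker, &result) != 0) {
            fprintf(stderr, "pthread_create failed\n");
            return 1;
        }
        /* ...the "parent" can keep working here, like fork()'s else branch... */
        pthread_join(tid, NULL);   /* roughly the moral equivalent of wait() */
        printf("worker produced %d\n", result);
        return 0;
    }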

Vladas Saulis

I spoke about the 'affinity' set for a running process by a sysadmin or in-script. This is a thing that can happen at random or by intention, and it can affect an MT program without your knowing when or where. There are many more controversial aspects of MT that I haven't mentioned yet; that, I think, is for a much wider discussion. Anyway, it is only my personal opinion that MT has no significant future. BTW, from the programming-model point of view, MT is nothing different from the old fork()->exec() approach; it only helps to make multitasking a little more lightweight. Please excuse me for the unexpected intrusion. If you are interested in speaking more about this problem, you are welcome to contact me privately at any time.

Justin James

Assigning processor affinity in the code is insanely boneheaded. The whole point of having a scheduler is that it works at run time, so your assumptions don't have to be hardcoded. The only time I would ever set processor affinity is in the case of something like a DB server, where it is expected that it will potentially dominate the system, *and this information is configurable by a sysadmin who is performing system tuning*. For example, the SQL Server config lets you set CPU affinity, presumably so that you can leave 1 CPU always free for other tasks, or maybe for some other purpose. But by default, it allows the scheduler to handle things. When you talk about these various processing resources being abstracted and such... you are perfectly describing systems in the mainframe and supercomputer class. You might want to read up on stuff like the Burroughs systems, Cray systems, etc. A few weeks ago, I started learning a lot more about these systems out of curiosity... what you've been describing fits them perfectly. It is exactly what they do. The programmer doesn't even bother knowing about MT, processes, etc. The system *just does it* and assigns resources based on availability and capability. J.Ja
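To make the knob concrete, here is a hedged, Linux-specific illustration of the call that implements this kind of pinning, sched_setaffinity(); it is shown only to demystify what the SQL-Server-style config does under the hood -- as said above, this normally belongs in a sysadmin's tuning setup, not hardcoded in application code:

    #define _GNU_SOURCE            /* Linux-specific affinity API */
    #include <sched.h>
    #include <stdio.h>

    int main(void)
    {
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(0, &set);          /* restrict this process to CPU 0 */

        /* pid 0 means "the calling process" */
        if (sched_setaffinity(0, sizeof(set), &set) != 0) {
            perror("sched_setaffinity");
            return 1;
        }
        printf("now pinned to CPU 0; the scheduler can no longer migrate us\n");
        return 0;
    }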

Justin James

TechRepublic has this one ad that blows Flash out to 100% CPU time. Thankfully, with multi-core CPUs (like my Core 2 Duo), only one core is jammed, and the OS transfers other stuff to the other core, so the system is still responsive enough for me to get the tab with the offending ad closed. Your example has a good point. However, in your example, if the CPU-jamming process is started FIRST, then the 2 threads your app creates will both land on the same core. It will still run a touch slower (timeslicing) than a single thread would, but not that much worse. If the CPU-killer gets started AFTER your app makes its 2 threads, I would imagine that the OS can shift the threads' core if absolutely needed, but I would also imagine that doing so is a resource-intensive process that does not occur unless absolutely needed. I would have to look into this more to find out for sure, and I am sure that the behavior varies from OS to OS. I know that mainframes can do it seamlessly. ;) J.Ja

Vladas Saulis

I took the worst of all possibilities, but it is still possible. A programmer may create a multithreaded program just perfectly, but then everything depends on the real run-time conditions (scheduler, other processes' priorities, I/O, etc.). Some systems like Linux even let you tie a process to a particular processor (called 'affinity'), so what happens to your MT program in that case? You can only pray... So, I'd like to see a system and/or multi-core programming paradigm which would take advantage of the overall processing power within the current runtime conditions, i.e., use *any* set of free resources. The OS itself should help to do such self-balancing too.

PTPage

I think your example ignores the fact that the scheduler would probably take care of assigning the threads to the most appropriate processor. So, in this case we have two compression threads and another (presumably much higher priority) thread. If the higher-priority thread really does monopolize one of your two processors, then the other processor will be time-sliced between the two compression threads. You'll get slightly worse performance from your two threads than from a single one, due to context switching, but not ten times worse. However, even my analysis assumes that the movie compression is CPU-bound. If, as is more likely, the compression threads need to do I/O periodically, then the two threads can be working with different parts of the hardware at the same time. Still not as much of a performance increase as having two CPUs available, but nowhere near as bad as you make out. It all comes down to being able to correctly analyze your problem, which is what many others have been saying: multithreading is a design-time issue. After that, tuning issues like choosing an appropriate number of threads for your thread pool can be a run-time decision. By the way, I've been working on multi-CPU (16+) boxes under Unix for over a decade. But it wasn't until I got out of the data-processing world of a major phone company and into the "real-time" world of a financial software firm that I really started to use multithreading. In the DP world, multiple processes (one per user) was sufficient.

Jaqui

For a pet project for this year. :D Actually, it gives me a criterion to meet in one I was already planning: a C-based CGI web app. [I know, I'm being masochistic there ;)] I'll add an internal db engine to it and see how difficult it is to make the app thread itself effectively, or use multiple cores, instead of connecting to an external db engine.

Justin James

Jaqui - Best of luck! I'll see you in a year when you emerge victorious... ;) Actually, it is not *that* bad to do. I implemented a decent flat file DB in Perl in under 100 SLOC. I would imagine that with some decent libraries (i.e., you are not writing "malloc" everywhere...) you could do it in about the same. The threading could be only a dozen or so SLOC; it depends on what exactly you intend for it. I remember when a number of people were writing CGI apps in C/C++. I saw a CGI written in bash once. Nothing wrong with it; it did what was needed. Dangerous as anything to let anonymous users be running bash like that, but whatever... :) J.Ja

Jaqui

Only if the code doesn't pay attention to security. :D Then it's as dangerous as any other technology with write/execute access to the system, like PHP, Perl, Python, Ruby, JavaScript... Sanitizing visitor input should be done no matter what language or technology is used for a site script. [The biggest security issue for any website script is the visitor input. ;)]

NickNielsen

But most managers still seem to think of it as "Dee Pee." Computer use in most businesses reflects this. I think Cray is the only one even thinking seriously about this. I would purely love to play with the Cray XMT for a day or two. Imagine it... Doom without latency!

Justin James

Crays are specifically designed for parallelism; that is a defining characteristic of the supercomputer class, as opposed to the mainframe class. Mainframes are optimized for I/O throughput, supercomputers for FLOPS. Likewise, mainframes emphasize stability and reliability, often going years without a reboot, even allowing CPU and RAM swaps (and upgrades!) to occur without downtime. Supercomputers only care about staying stable long enough to produce output; uptime is not a priority. Cray is NOT the only one thinking about this. Intel is working extremely hard to get the tools into the hands of developers, with the Ct compiler stuff. Unfortunately, that is for C/C++ and FORTRAN developers, and most of us use Java, VB.Net, or C# nowadays. Microsoft is working very hard on this too. They just CTP'ed the "Parallel Extensions" library for .Net, which introduces a raft of features that make it very easy to perform common multithreading tasks in .Net languages, like parallel for loops. This is very similar to what Cray is famous for. Microsoft Research has also been churning out some very interesting stuff; I recently read an article about using a graphics processor as a parallel computing device to produce a statistically sound random number generator. Very neat stuff. Sadly, this is still (mostly) in the "uber nerd" realm of things that the average business programmer is only faintly aware of, since it appears to be outside of their required skill set. J.Ja
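To give a rough idea of what a "parallel for" looks like, here is a hedged C analog using OpenMP (compile with something like gcc -fopenmp); this is an analogy of my choosing, not the .Net Parallel Extensions API itself, and the loop body is a stand-in for real per-element work:

    #include <omp.h>
    #include <stdio.h>

    #define N 1000000

    int main(void)
    {
        static double data[N];

        /* The pragma asks the runtime to divide the iterations among
           however many threads/cores are available. */
        #pragma omp parallel for
        for (int i = 0; i < N; i++)
            data[i] = (double)i * 0.5;  /* stand-in for per-element work */

        printf("ran with up to %d threads\n", omp_get_max_threads());
        return 0;
    }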

mrogers

I have long asked myself about this topic. Why do we have dual-, quad-, or octo-core technologies if they don't seem to speed up anything? Why can't the OS, which handles the threads, say to itself, "Oh look! There are thirty threads on this app trying to run; let's split them up across the CPUs (cores)!" I understand the limitations and lack of possibilities with this setup, but if AMD and Intel are going to go nuts with their multi-core stuff, then get someone like MS and/or Apple, and perhaps most beneficially the *nix developers, to WANT to write for them. If we have to buy new PCs, we want them to run faster! We have Server 2003 running on dual quad-core Xeons at roughly 2.8GHz in our servers here, and I swear they aren't doing anything special -- just lots of it. I have one server that is just for Web, and my Intel Centrino laptop is spunkier than it is! I know these Xeons with eight cores should be handling a LOT more, but they simply can't, because nothing is governing the proper usage of multithreading and application efficiency. The OS should, in my opinion. Of course, they'd have to rewrite the kernel, but WAIT, that's what Windows 7 is supposed to be about: a new kernel. Hmm... I sure wonder!

Dr Dij

I'd be happy as an end user if the OS just allocated my progs across the processors. Maybe the OS running mostly on one core; my prog that readies 2000 images in batch for web pages on another; ftp to upload hundreds of megs to my site running on a third; Firefox and IE web browsers (I usually have 6-10 copies open) on another. And I should be able to run VB or Eclipse and other IDEs, or the Java-based SkillSoft programming courses that come with the ACM.org site, while the others are running (all the while supporting two or more monitors -- I don't ask for much... :) ). Right now the ACM courses time out if I try these all at once.

My pasting photos through various programs exhausts memory, and I actually have to bother to stop and close 30 photos at that point. Other programs become unstable if memory gets low or the processor too slow, too. I've ordered a quad core 6600 (2.4GHz) with 4 gigs; I will be sorely disappointed if this doesn't work after that, compared to my current single processor (1.7GHz). How much of the improvement will be due to faster I/O and memory vs. the quad core?

To me, today's PCs are fundamentally flawed when it comes to processors. The mobo and processors should be packaged in a modular fashion that allows you to simply open up a case and slap on a processor or three. There are limits to how many processors can be put on the same die, and that becomes more expensive than mass-producing the processors and allowing them to be combined separately (though performance is slightly lower, of course). The OS should automatically recognize the addition, either hot-swap or on next boot, and a supervisor / hypervisor / virtual scheduler should allocate tasks among them.

Justin James

"I'd be happy as an end user if the OS just allocated my progs across the processors. Maybe the OS running mostly on one core, my prog that readies 2000 images in batch for web pages on another, ftp to upload hundreds of megs to my site running on a third, firefox and IE web browsers (usually have 6-10 copies open) on another." Vista + Core 2 Duo does a good job of this. I've blasted the CPU with some heavy processing editing in VB.Net, using multiple threads, and it handled things quite well. Even when it grabbed 100% CPU, Vista de-prioritixed when I wanted to do other things so the system was sluggish, but certainly not miserable. "My pasting photos thru various programs exhausts memory and I actually have to bother to stop and close 30 photos at that time. Other programs become unstable if mem becomes low or processor too slow too.." I haven't seen these issues since I put this system together. Things that helped were adding ReadyBoost memory (2 times physical RAM for best results), and having 2 RAID 1 drive arrays; one for OS/apps, the other for data. If I wanted ultra performace, I would add tiny RAID 0 array just for swap, but the ReadyBoost does a great job with that regardless. "How much will be due to faster I/O and memory vs the quad core?" Most of your gain will not come from the CPU. Most of your slowdown is not because your CPU is jammed, or processes/threads are competing for CPU, they are competing for RAM and disk access, probably for swap purposes. If you're pumping a lot of data over the buses by doing things like cppy/paste of big items, what happens is that the physical RAM gets overrun, hitting the swap file... causing the system to be disk bound. Then before swapping is fully done, you go to another app (more swapping as it comes from the background to the foreground), then your paste copies from swap (or worse, it's still trying to get it out of swap, so it needs to wait for that too!) and then add it to the recipient's heap... again, more swap! "To me today's PCs are fundamentally flawed when it comes to processors. The mobo and processors should be packed in a modular fashion that allows you to simply open up a case and slap on a processor or three. There are limits to how many processors can be put on the same die. And this becomes more expensive than mass producing the processors and allowing them to be combined separately. (tho performance is slightly lower of course) The O/S should automatically recognize the addition either hot swap or on next boot. And a supervisor / hypervisor / virtual scheduler should allocate tasks among them." Mainframes and supercomputers did this. Literally, decades ago. If you want your personal PC to do this, move to a mainframe-grade *Nix that will run on the x86 platform. I doubt you can get HP-UX or AIX on there, but you can get Solaris. Solaris supports things like hot swap/upgrade of CPUs and RAM, it treats them like drives in a RAID... you tell the OS to "down" the device, you do your swap/upgrade, tell the system to bring it back up. That's my understanding, at least. I've only been a user of Solaris, not an admin. Mainframes like AS/400's handle this automagically, and have for literally decades. That's why you can see systems where uptime is greater than 10 years, but it has modern hardware. 
Because all code is bytecode (like th eJVM or .Net CLR), these systems can undergo in-place OS upgrades, swaps from RISC to CISC architecture, and so on *while still staying up* and *without the programs requiring a recompilation, or even being stopped*. And they are immediately optimized for the new hardware. Impressive, eh? Too bad we can't get that on our desktops. Then again, up until a few years ago, the hardware just wasn't there, and it was too expensive. J.Ja

Tony Hopkinson

A lot of developers aren't capable of doing it. It takes longer, it's harder, it costs more, it's much harder to debug, and it can make future enhancements problematic, to say the least. It's quite easy to paralyse a program by making it parallelise, even if you know what you are doing. The continual dumbing down of the discipline, that is, the cookie-cutter approach to development, has left the industry with fewer people who can do the job, not more, as those Gartner types would have us believe.

Absolutely

[i]The continual dumbing down of the discipline, that is, the cookie-cutter approach to development, has left the industry with fewer people who can do the job, not more, as those Gartner types would have us believe.[/i] As usual, you make a good point, with an extra helping of snark. "Gartner types" probably do increase the market demand for half-baked, marginally proficient programmers, but preparing [b]oneself[/b] to deliver more than those "Gartner types" know to expect, or even know how to describe, is the responsibility of the programmer, not of the "Gartner types."

Justin James

"open, secure, convenient, pick two". When I say "open" I don't mean "open source", I mean accesible to outside code (macros, APIs, libraries can be linked to, OLE, DDE, COM, etc.). "Convenient" means "easy to use without having to jump through hoops". Every system that has been both secure and open (like *Nix) is a pain to work with (that's why so many people trash the security model with SETUI, sudo and such). Convenient and secure systems like the Mac are hardly open. And open and conveneint systems (MS Office) are not secure. J.Ja

SnoopDougEDoug

In hardware the saying is: faster, smaller, cheaper -- pick any two. I wonder what we would say in software? More features, sooner, higher quality -- pick any two?

Tony Hopkinson

But the fundamental idea is sound. My experience with it on a practical level, however, has been patchy, ranging from wildly successful to management telling me I should be more agile while nailing my feet to the floor. It has to come from the business; unfortunately, many view it as some sort of internal IT process that will realise lots of benefits for no outlay. I remember asking for a MoSCoW recently; what came back was so far out of the original scope I thought I had the wrong document, and everything had "M" against it. :(

Justin James

... is from Ghostbusters: "I've worked in the private sector. They expect results." They may expect results, but they frequently fail to provide an environment in which results can be delivered! In fact, it's the focus on getting "something now" rather than getting "something right" that produces the situation. This is the basis for a lot of the Agile stuff: it is impossible to change the users/customers, so find a way to work with them. It's a good thought, it really is. J.Ja

Absolutely

What part of that would be snarky?

Tony Hopkinson

Quality is an overhead in commercial software development.

Justin James

... the OS *does* handle this. I am writing from the angle of the developers not making threads to do work. For example, on a dual-core machine, if you are, say, compressing a movie, it makes a lot of sense to split the movie in half and compress half of the frames in one thread and half in the other; the OS is smart enough to put each of those threads on a different core, so each core is cranked at 100%. The end result? The work gets done in half the time it would take if the movie were compressed in a single thread. But developers are not doing this kind of thing. Another optimization is to split lengthy I/O requests (DB access for large or tough queries, downloading of large files, etc.) into a separate thread while the rest of the computations run, and re-join the I/O thread when its output (or input) is needed. Again, a lot of folks aren't doing this. Until they start doing it, there is nothing for the OS to handle! I guarantee that of the 30 threads you see, nearly all of them are idle threads, basically event handlers waiting to be woken up. J.Ja
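A minimal sketch of that split-the-frames-in-half idea, using POSIX threads (compile with -lpthread); compress_frames() is a hypothetical stand-in for the real codec work, and error handling is omitted for brevity:

    #include <pthread.h>
    #include <stdio.h>

    #define TOTAL_FRAMES 1000

    struct range { int first, last; };

    static void compress_frames(int first, int last)
    {
        (void)first; (void)last;   /* pretend to compress frames [first, last) */
    }

    static void *worker(void *arg)
    {
        struct range *r = (struct range *)arg;
        compress_frames(r->first, r->last);
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        struct range lo = { 0, TOTAL_FRAMES / 2 };
        struct range hi = { TOTAL_FRAMES / 2, TOTAL_FRAMES };

        /* One thread per half; the OS is free to run each on its own core. */
        pthread_create(&t1, NULL, worker, &lo);
        pthread_create(&t2, NULL, worker, &hi);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        puts("both halves compressed");
        return 0;
    }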

GoodOh

Sorry if I'm simply demonstrating how thick I am (it wouldn't be the first time). mrogers is talking about 30 threads 'trying to run', which isn't how I would describe threads waiting for an event trigger. It seems to me the key issue isn't the need for OS enhancement, as mrogers is seeking, but rather for developers to better multi-thread their apps. Are we suggesting that mrogers is looking in the wrong place? That's what I would have thought, but the original post led me to question myself.

Tony Hopkinson

Doing any multi-threading would be a start. If I were writing a program to collect data, with a print option, as a programmer I'd probably do the printing in a separate thread, and a good OS could make the decision to run it on another processor. But to provide that option to the OS, I have to isolate the printing code and 'mark it up' as something that can be run at the same time. Writing an OS that could 'look' at the code and mark it up itself... There have been some strides in providing 'hints' to the OS, but they don't make the problem any simpler. Also, the functional programming paradigm has good possibilities, but that's an optional extra in CS courses and well beyond the skill set of the once-wrote-a-macro MBAs businesses like to employ to code.

GoodOh

I am probably not 'getting it', but my understanding is that OS X 10.5 is doing much of what I think you are asking for: [url]http://www.intel.com/intel/osx/index.htm?iid=homepage+news_osx[/url] It only just made it into 2007, but it is out and about with a multi-core-aware kernel. Perhaps you could explain what is missing, so I can better understand the gaps between marketing and reality.