Software Development

The programming paradigm needs an update

Find out what Justin James really dislikes about programming, and then share your ideas about what direction the programming paradigm should take in the future.

The way I view programming is different from what it was a week ago, for several reasons. First, I have spent a lot of time lately with my head buried in the .NET documentation, and something in the back of my mind said, "If C# is so much more efficient than VB.NET, why do all of the code samples appear to be the same length and look virtually identical?" Then, I reread an e-mail from someone telling me that XYZ language was awesome because it needed dramatically fewer source lines of code (SLOC) to do the same thing as more mainstream languages. And I have still been multithreading the same dumb little piece of code. Finally, I read a post by Mark Miller about Emacs and Lisp.

It is all coming together for me now, the conscious understanding of what I really dislike about programming. I got into programming because I like solving problems, but solving problems is really only a small fraction of the work. The rest of it is giving exact, precise details as to how to perform the tasks that solve the problems. It is like painting the Sistine Chapel by putting a seven-year-old boy on a scaffold and dictating to him every stroke you want him to make. It is ridiculous.

This highlights the real problem in the programming industry: Everything we slap on top of the development process is more garbage on top of a rotting foundation. Let's take a closer look at what we do here, folks, and examine how much the programming paradigm has (or has not) changed.

Then (i.e., 10 to 20 years ago)

A programmer sits down and analyzes the process of the work. Using a stencil and a pencil, he lays out a workflow on graph paper. The programmer sits down, fires up his trusty text editor (if he is lucky, it supports autoindent), and creates one or more text files containing source code. He then compiles the source code (or tries to run it through an interpreter) to check for syntactical correctness. Once it compiles or interprets correctly, he runs it with test cases to verify logical correctness and to check for run-time bugs. Any bugs are logged to a text file; or maybe he has a debugger or crash dump analyzer to find out what went wrong. This continues until the program is considered complete.

Now

A programmer sits down and analyzes the process of the work. Using a flowcharting tool or maybe a UML editor, he lays out a workflow. The programmer sits down, fires up his trusty IDE (if he is lucky, it supports version control), and creates one or more text files containing the source code. Some of this code may be auto-generated, such as data objects based on the database schema or some basic code from the UML. Half of this generated code will need to be discarded unless the project is very basic, but at least it is a start. The IDE will handle the basics of getting a form to display regardless of whether it is a desktop app or a Web app. The programmer then compiles the source code (or tries to run it through an interpreter) to check for syntactical correctness. Once it compiles or interprets correctly, he runs it with test cases to verify logical correctness and to check for run-time bugs. Any bugs are logged to a text file; or maybe he has a debugger or crash dump analyzer to find out what went wrong. This continues until the program is considered complete.

Wow... we've traded in vi for Visual Studio and a bunch of print statements for [F11]. At the end of the day, nothing has really changed except for the tools, which are barely keeping pace with the increasing complexity of developing software. We are stuck, and we are stuck badly.

The Lisp advocates like to say, "Lisp would have taken over if only we had better/different hardware!" Now we have the hardware, and Lisp is still not taking over. I assert that we need to go far beyond something like Lisp at this point; we need to get into an entirely new type of language. A great example of this is multithreading.

IT pros usually say two things about bringing multithreaded work into the mainstream: Compilers need to get a lot smarter before this can happen, and we cannot make the compilers smart enough for this to happen well. If this seems like an irresolvable contradiction, it isn't. The problem is not the compilers -- it's our paradigms.

If you've spent a fair amount of time with multithreading, you know that it will never become a commonplace technique until the compiler can handle most of it automatically. The fact is that it's tricky for most programmers to keep track of all the details of a system where concurrency is an issue -- and debugging these systems is pure terror.
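To make that concrete, here is a rough C# sketch (the names and numbers are made up for illustration) of what even a trivial "add these numbers up on two cores" job turns into once every detail has to be spelled out by hand:

using System;
using System.Threading;

class ManualParallelSum
{
    static long total = 0;
    static readonly object totalLock = new object();

    static void SumRange(int[] data, int start, int end)
    {
        long partial = 0;
        for (int i = start; i < end; i++)
            partial += data[i];
        lock (totalLock)          // forget this and you have a race condition
            total += partial;
    }

    static void Main()
    {
        int[] data = new int[1000000];
        for (int i = 0; i < data.Length; i++) data[i] = i;

        int mid = data.Length / 2;
        Thread t1 = new Thread(delegate() { SumRange(data, 0, mid); });
        Thread t2 = new Thread(delegate() { SumRange(data, mid, data.Length); });
        t1.Start(); t2.Start();
        t1.Join(); t2.Join();     // forget to join and 'total' gets read too early

        Console.WriteLine(total);
    }
}

Nothing in that code says "sum these numbers"; the intent is buried under partitioning, locking, and joining -- and this is the easy case, with no shared mutable state beyond a single running total.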

Smarter compilers are the answer, but there is a truth amongst the compiler folks that compilers will never get that smart. Why? Because properly translating a serial operation into a parallel one requires understanding the code's intentions, not just the step-by-step how-to guide that the source code models. I can read your code and (provided it is clear enough) understand probably 80% of your intentions up front and possibly figure out an additional 10% or 15% of your code with time. The other 5% or 10% will be a mystery until I ask you.

Read the comments on this post from Chad Perrin about Python and Ruby; these people are trying to optimize a trivial chunk of code. What is even more interesting is the point Chad makes over and over again in his responses (he posts as apotheon): Maybe the intention of the code was to produce that onscreen counter that is eliminated in most of the optimizations. Or maybe the concatenation of immutable strings, slow as it is, had a desirable knock-on effect as well. Who knows? The intentions cannot be conveyed through the code without ridiculously verbose comments in the code or an exacting product spec on hand.

Perl appears to understand intentions, but it is a trick. Perl was simply designed to make certain assumptions based on the way most programmers usually think in the absence of explicit code. If your intentions do not square with Perl's assumptions, it either will not run, or it will not run properly. If you do not believe me, try abusing Perl a little with a record format that specifies the tilde as the record separator instead of newline. Perl's assumptions fly out the window unless you set the input record separator in your code; and, at that point, you are no longer actually using the assumptions -- you are giving it precise directions.

I am all for self-documenting code. I like to think that no one reading my code has had to struggle to figure out what it does. But aside from design specs and/or pages of inline comments, there is no way for someone reading it to know why I wrote the code that I did. It is like being in the Army: they usually don't tell you why you need to march down a road; they just tell you to do it.

Programmers are in a bind. Without a way for the compiler to know the design specifications, the compiler cannot truly optimize something as complex as a parallel operation beyond really obvious optimizations. There are no hard and fast rules for making a serial operation parallel; it is a black art that involves a lot of trial and error even for frequent practitioners. In order for the compiler to be able to understand those design specifications, it would require an entirely different type of language and compiler. Assertions and other "design by contract" items are not even the tip of the iceberg -- they are the seagull sitting on the iceberg in terms of resolving the issue. If the compiler is smart enough to understand the design document, why even bother writing the code? UML-to-code tools generally try to do this, but they do not go far enough. At best, these tools can translate a process into code that literally expresses that process; there is no understanding of intention.

There are a few major schools of programming languages -- object oriented, procedural, functional, and declarative -- all of which have their strengths and weaknesses. Functional and declarative languages come the closest to what I am talking about. SQL is a great example. You tell it what you want, not how to get what you want. The details of the "how" are up to the database engine. As a result, databases are optimized in such a way that they run a lot faster than 99.9% of programmers could get the same stuff to run by hand (not to mention redundancy, transactions, etc.), despite their rather general-purpose nature. Even SQL shows a lot of weakness; while it is a declarative language, it is extremely domain specific. As soon as you want to manipulate that data, you land yourself in the world of procedural code, either within a stored procedure or in your application.

Most applications (and every library) become miniature domain-specific languages (DSLs) unto themselves -- not syntactically (although some, such as regex libraries, reach that status), but in terminology, subject matter, and so on. The problem you are most likely trying to address has absolutely nothing to do with the language you are using and everything to do with a specific, nonprogramming problem.

A great example of the disconnect between code and the problem it addresses is the subject of loops. I bet over 50% of the loops out there iterate over an entire data collection and never exit without hitting every single element in the collection. The logic we are implementing is something like "find all of the items in the collection that meet condition XYZ" or "do operation F to all items in the collection." So why are we working with languages that do not deal with set theory?

Programmers end up implementing the same thing over and over again: a function (or object, or whatever) in a library that takes that big set as the input and either a filter command that returns a smaller, filtered set, or an "ExecuteOnAll" method that takes a pointer to a function as the argument.
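In C# terms, the wheel that keeps getting reinvented looks something like this (the helper names are hypothetical; this is only a sketch):

using System;
using System.Collections.Generic;

static class SetHelpers
{
    // "find all of the items in the collection that meet condition XYZ"
    public static List<T> Filter<T>(IEnumerable<T> items, Predicate<T> condition)
    {
        List<T> result = new List<T>();
        foreach (T item in items)
            if (condition(item))
                result.Add(item);
        return result;
    }

    // "do operation F to all items in the collection"
    public static void ExecuteOnAll<T>(IEnumerable<T> items, Action<T> operation)
    {
        foreach (T item in items)
            operation(item);
    }
}

List<T>'s own FindAll() and ForEach(), and LINQ's Where(), are exactly this pattern baked into the platform.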

When the language folks try to make it easier on us, the result is stuff like LINQ: the welding of a DSL onto a framework or language to compensate for that language or framework's shortcomings in a particular problem domain.
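For example, a LINQ query lets you state the "what" and leave the "how" to the provider; here is a minimal sketch (the Invoice type is invented purely for illustration):

using System.Collections.Generic;
using System.Linq;

// Hypothetical type, just to give the query something to work with.
class Invoice
{
    public bool IsOverdue;
    public decimal Balance;
}

class LinqExample
{
    // The query states what is wanted; iteration, filtering, and projection
    // are left to the LINQ provider, much as SQL leaves the "how" to the engine.
    static IEnumerable<decimal> OverdueBalances(IEnumerable<Invoice> invoices)
    {
        return from invoice in invoices
               where invoice.IsOverdue
               select invoice.Balance;
    }
}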

The only way LINQ could get implemented is if the languages that support it have closures, which are "the thing" in a functional language like Lisp; closures are even important in Perl. But programmers have been working in Java and Java-like languages for the last five to 10 years (depending on how much of an early adopter you were), and a lot of folks missed Perl entirely, either sticking with VB and then the .NET languages or starting with C/C++ (or Pascal) and going to Java and/or .NET. The only exposure most programmers ever had to a truly dynamic language is JavaScript (aka ECMAScript). (By dynamic language, I mean a language that can edit itself or implement or re-implement functionality at run-time. JavaScript, with its support for eval(), has it, and so does Perl. In functional languages like Lisp, this is all the language really can do.) In reality, JavaScript is not that bad; in fact, I like it quite a bit. I am just not a huge fan of the environment and object model programmers are used to seeing it in (the browser DOM).
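For anyone who has never bumped into one, a closure is just a piece of code that captures the variables around it. A minimal C# sketch (the values are made up):

using System;
using System.Collections.Generic;

class ClosureExample
{
    static void Main()
    {
        List<int> numbers = new List<int> { 1, 5, 12, 18, 40 };
        int threshold = 10;   // a local variable, captured below

        // The anonymous delegate "closes over" threshold; it carries that
        // local variable along with it wherever it is passed.
        Predicate<int> isLarge = delegate(int n) { return n > threshold; };

        foreach (int n in numbers.FindAll(isLarge))
            Console.WriteLine(n);    // prints 12, 18, 40
    }
}

The C# 3.0 lambda syntax is just shorthand for the same thing, and it is what LINQ's query operators are built on.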

Most of the programmers I have dealt with have little (if any) experience with a dynamic language. The result is that they don't have the ability to think the way you need to about a dynamic language. These programmers stumble around blind; they intuitively know that they aren't working to their potential, but they have no idea where to start. (Why else would there be the huge market for code generators and other gadgets to improve productivity by reducing typing?)

I do not believe that Lisp and its ilk are the answer; they are too difficult for most programmers to read through and understand, and the entire program is a back reference to itself. Procedural languages have also proven hard to maintain. Object-oriented languages are the easiest to maintain -- at the expense of being insanely verbose, which is another reason for the code generators. Declarative languages are almost always going to be little more than DSLs.

Where do we go from here?

I want to see programmers headed towards intentional languages. I have no idea what they would look like, but the source code would have to be a lot more than the usual text file and would need to be able to express relationships within the data format, the way XML or a database does. The programming paradigm needs to be able to quickly reduce any business problem to an algorithm so that the real work can go into writing the algorithm and then, at the last minute, translating the results back to the domain's data structures. The result would look like a mix of a lot of different ideas, but syntax and punctuation would have to be fairly irrelevant.

What are your thoughts about the programming paradigm? What direction do you think it needs to take?

J.Ja

About

Justin James is the Lead Architect for Conigent.

219 comments
bcroner

Maybe it's not the tool set. I think either you can or you cannot program. Further, I think you can separate programmers from paycheck chasers by their tools. A true programmer could produce his work from scratch in a text editor. Paycheck chasers rely on IDEs and available code samples to copy-and-modify their work.

hercules.gunter

It seems to me that the real issue being debated here is not how compilers do what they do to support programmers, but the attitude of the programmers and how they do what they do. You have a problem of communication and synchronisation between your threads. Now that you've identified the problem, solve it! If your solution is good, it may even be marketable. But don't expect the compiler to do it all for you - it can't possibly "understand" every problem domain out there. If there isn't a solution to your problem "out there", analyse the problem, then design and implement a solution. So you need to [...]. By this you mean, precisely, [...]. A mechanism for achieving this would be [...]. And now you're quite a long way to a solution: do it yourself! If it's really difficult, you'll reach the point where you can't see the next step. Write down what you've got, preferably not as notes but as an explanation to someone else of what you have (ask them for help, and good luck with that), then set it aside to let your subconscious work on the problem, and come back to it another time. (The toughest problem I ever solved took me ten or more goes several days apart, at the end of which I had 20 lines of recursive code which worked like a dream, the first and every time. Those 20 lines were coded in a matter of minutes, but each one represented a couple of hours of thought! But it was worth doing!)

rclark

Thanks so much everyone for sharing. A lot of what has gone on in the last few years now makes much more sense. I could jump in and give you years of theory, but it wouldn't help much. I can point you in the right direction by saying imagine the waterfall. Imagine OO is only putting the water in bottles. Makes working with it easier, but it's still just water. Now move the waterfall to the space station. Nothing works anymore. Concurrency becomes a real problem. Data splinters and becomes unmanageable. The OO people will be able to force their collections together, but working with the data is still a pain in the new environment. When the water splinters it acts like a bowl of marbles without rules. Any time external forces act, it sets off a chain reaction of activities. Concurrency goes out the window because everything reacts. This is the environment we are living in. And we attempt to keep up by programming in Notepad, VS, or Vim. No wonder there are people who are frustrated.

Vladas Saulis

I think that even fully automated multithreading leads to nothing, because of its internal 'locking' nature. Locks and the need for constant synchronisation between threads are what make it hard (even for hardware and the OS). Recently I wrote an article on a new (or maybe not so new) programming model. You may find it at http://www.prodata.lt/EN/Programming/OPU_computing_model.pdf I'd like to get your feedback on it. Does it fit your visions?

techrepublic

I'm not a programmer. At least not like most of the posters here. I haven't spent years cranking code and polishing my skills in this or that language, this or that OS, this or that environment. What I do is solve problems. Sometimes (most of the time) a microprocessor or microcontroller or some kind of electronic or electrical or mechanical logic is involved, and a programming language is used (VB, assembler, C(++), Forth, etc.). Some of the time, multiple simultaneous processes need to be handled. So I understand the nature of the problems you folks are hired to solve. But... This is not really a programming paradigm issue. If I understand it, doing multi-threaded programming for a multi-processor machine is a bitch. And it would be real nice if there were tools to make the job easier. Do any of you remember when the ALU and CPU were separate chips and had to be programmed separately? Or am I dating myself? I think the paradigm that needs changing is in the evolution of the hardware. Parallel processing is a great idea that evolved from the notion that since it CAN be done it should be done. And while I can't fault the concept of it, I can fault the way the hardware manufacturers have designed it. Hardware is cheap. Most of my projects use $2.00 microcontrollers. The way I string them together and teach them to talk to each other, plus the way I design in redundancy and fault tolerance, allows for a pretty easy multi-threaded process. OK, I KNOW you folks don't work with this kind of hardware, this environment, this language, this anything! But, eventually, when the hardware mfrs (and their customers) realize that the way they have designed their multi-processor systems is not cost effective in terms of programming, they WILL change. So, hang in there. Change is on the horizon. It always is.

binghamc

I began reading this article with interest as you describe the process of writing an application and how outdated the model is. My interest was piqued when you showed the similarities between coding circa the 1970s and now. You are exactly right. However, you then began talking about multi-threaded programming as some sort of new wave, even resorting to calling it a programming paradigm, and then went on to compare it to intential programming. What you have described absolutely in no way changes the programming process you just finished describing above. You could not be more wrong on that comparison. Yes, there are benefits to multithreaded development... yes, there are differences in how you partition your application and code the interaction between your objects... yes, there are differences in how you have to think about re-entrance and parallelism; however, you still write code the exact same way with an IDE or in a text editor and design your application with UML or some other tool. There is no "intention" related to this process of software development. Your "intent"... I think, is in the right place however. Intentional programming refers to writing instructions at higher levels of abstraction and then letting compilers or code generators figure out what you 'mean to do' - your intent. These generators then figure out how to write the lower level code to accomplish the tasks. Read this Wikipedia entry on IP. It provides a good overview of the concepts and a bunch more links to other IP resources: http://en.wikipedia.org/wiki/Intentional_programming Also, the article that got me going on IP was this one from Technology Review: http://www.technologyreview.com/Infotech/18047/?a=f

dhensarlingQ

I love all the buzz words and lofty talk about parallel (two) or serial (one) lines of thought. Here is what is wrong with the programming languages: they were developed by people who do not program like they talk or think. For example, instead of a data grid, data binding, and all that other jargon, why not get what you say? Sure, there have to be some syntax rules, but in plain English. If I want to get data from Excel, why not something like "get cell data" for a statement? A language should be written so the average educated person who has never programmed can look at the text and realize the intent of the code. I program in an APT language for numerical machine controls. To move a tool from point A to point B, the syntax is move,a,b. Tough, huh? With all the technology we have, why is this not possible?

Absolutely

[i]In order for the compiler to be able to understand those design specifications, it would require an entirely different type of language and compiler. Assertions and other "design by contract" items are not even the tip of the iceberg -- they are the seagull sitting on the iceberg in terms of resolving the issue.[/i] Still an excellent image to make your point, and got me thinking about nesting birds & nested statements. It seems simple enough: once a compiler encounters any conditional statement (if/for/while) with any others nested within it, if it has two processors ("cores" they're calling them this year, whatever) it should use one to check the innermost and the other to check the outermost, so that when the outer statements are evaluated, the inner ones are already evaluated. Or, taking the program as a whole, one core tackles the innermost condition of each nested set while the other works from the opposite direction, and they meet in the middle, avoiding altogether some statements or modules that are not called due to the initial conditions of the current execution of the program. But high-level code can't be easily compiled in parts like that, as I understand it, and if it was broken into parts for compiling, we'd all pity the fool who tries to put it back together. The overhead of such algorithms would almost certainly eclipse the gain imagined by full utilization of multiple processors. C#'s JIT compiling strategy does something along those lines, for a measurable improvement in execution time, but even the most enthusiastic marketing department writers cannot hide the fact that two cores/processors are nowhere near twice as fast as one, other things being equal. The programming paradigm cannot efficiently utilize parallel processing for most tasks. [i]I can read your code and (provided it is clear enough) understand probably 80% of your intentions up front and possibly figure out an additional 10% or 15% of your code with time. The other 5% or 10% will be a mystery until I ask you.[/i] You suggest a very interesting new paradigm. [i]So why are we working with languages that do not deal with set theory?[/i] Is there a language that [u]is[/u] based on set theory? Are you preparing to make one? It seems like it could work. [i]There are a few major schools of programming languages -- object oriented, procedural, functional, and declarative -- all of which have their strengths and weaknesses.[/i] All of those share the basic assumption of serialization, ie events in sequence. To illustrate how under-utilized parallelization is, consider what "while" loops do! The word "while" has nothing to do with 2+ [u]simultaneous[/u] events. It translates to English more like "as long as", and when I think of writing concurrency into a program, I draw a total blank. This is a mind-bender of a topic, Justin. Thanks.

Justin James

In a way, yes, the attitude of programmers, not the technical capabilities of compilers, is the issue. After all, how many programmers are even *thinking* about these kinds of issues? J.Ja

Tony Hopkinson

How do you parallelise a recursive routine? :( Testing it, debugging it, and extending it are three other problems as well. Especially the last one. I won't even mention stacks, garbage collection... Short of very well understood, and effectively fixed, solutions such as walking a tree, I avoid recursion like the plague.

Justin James

Interesting analogy, but apt. Glad you like the conversation around this! I think a lot of the folks who are strictly "business programmers" are not really going to be seeing this stuff, but that other 5% of the code out there drives a ton of what goes mainstream down the road. OO is a great example. :) J.Ja

Justin James

Vladas - Your idea has some merit to it, but I think that it would be extremely difficult to fit on top of existing OS and hardware infrastructure. Without (at the very least) an OS designed around this idea, it would be extraordinarily difficult for an application to understand the CPU capabilities, for example. But I do believe that the idea has merit to it, and it does look like it would significantly simplify many areas of multithreading. J.Ja

apotheon

It looks like you're trying to reinvent Alan Kay's vision of OOP as a client/server model of computing.

Justin James

My uncle works at the level you do, and so does a friend of mine. What I find fascinating is that they are all mystified by the code that I write, yet I am mystified by the stuff they work with, even though we all know that at a very high level, they are the same thing. You deal in voltage differentials, I deal in variables changing values. I really like the way Frank Herbert's books often express "programming" as the physical manipulation of very advanced electronic components, rather than writing code. It gives it a tangible feel that I like a lot. Sadly, it is fiction. :) J.Ja

fgoodrum

I agree. I find that every roadblock to completing my current project, and consequently getting paid, is in the notes for the next release.

Justin James

I use multithreading as an *example* of the failure of the current paradigm, not as a new paradigm. I know that it is not new... but it is becoming much more important as dual-, quad-, and more-core CPUs replace the single-core, single-socket PC in server rooms and desktop/laptop PCs. But it is ridiculously difficult to express parallelism in the current paradigm, even for simple tasks! Regarding "intentional programming" as Charles Simonyi discusses it... I love everything I have read about it. I definitely should have explicitly referred to it, since it has been so influential on me. But I am also waiting for them to release something. :) And yes, that precise article in Technology Review is what got me interested in it too! J.Ja

binghamc

No I can't spell either. Should be Intentional...as was my intent :-)

apotheon

[b]list_of_numbers = [1, 2, 3, 4, 5]
list_of_numbers.each {|number| print number }[/b]
That was Ruby. Pretty simple, very readable. It's a lot like natural spoken language.

Justin James

"Is there a language that is based on set theory?" I beleive that APL is either based entirely on set theory, or just incorporates a ton of it. And languages like Lisp/Scheme, OCaml, etc. that function on tuples are essentially using a bit of set theory. "All of those share the basic assumption of serialization, ie events in sequence. To illustrate how under-utilized parallelization is, consider what "while" loops do! The word "while" has nothing to do with 2+ simultaneous events. It translates to English more like "as long as", and when I think of writing concurrency into a program, I draw a total blank." GREAT EXAMPLE, THANK YOU! (caps intended) Take a look at this code: bEndCondition = false; while (!bEndCondition) { print 'Hello!'; bEndCondition = true; print 'Good bye!'; } The way most (non-programming) people think, this code would *never* print "Good bye!". Why? Because as you say, "while" means "as long as"; in a nutshell, "while" would not be a loop at all, but a condition attached to every item within the block, and optionally looping. That would be a great convergence of matrix algebra and code; the "while" statement would function as a *multiplier* to the vector of statements; the moment that multiplier hits "0" (the while no longer functions), it would zero out all of the values in that vector, *effectively acting as a transactional rollback*. In a language with lazy evaluation (combined with a "flush" operator on the eval chain), this would be an extremely useful way of working, and one that much more closely resembles what people really think of. It sure would make concurrency a heck of a lot easier (and efficient!) too. "This is a mind-bender of a topic, Justin. Thanks." You're welcome! It's nice to see folks getting excited about the same things I do. :) J.Ja

apotheon

"[i]All of those share the basic assumption of serialization, ie events in sequence.[/i]" Actually . . . no, not really. Not in the case of OOP and functional programming, at least. OOP, in Alan Kay's conception of it (and he was the guy that coined the term "object oriented programming", so he probably knows what he's talking about), assumes precipitating events and operational dependencies, but allows for infinite parallelization with arbitrary ordering of events between the two extremes, each "object" in the system essentially its own discrete operator. The lambda calculus, from which the functional programming model was derived, similarly supports a model of computation with a precipitating event at one end and an evaluative point at the end where all the operational dependencies have been met, complete with arbitrary ordering of events with effectively infinite opportunity for parallelization in the middle. What assumes serial events in the middle is the way most people [b]think about[/b] programming, including the people who design the languages we use (thus making the sort of parallelization-by-default that should be second nature to us an artificially and unnecessarily difficult proposition).

tuomo

Agreed, except I would extend "who even thinks about issues on this level" to all levels. SW designers should, but they don't have to know the (hard) system architecture; now the developer has to know how to use the systems available. Let me give an example. Report-writing systems are a perfect example where you can have parallel processing and get huge returns in time and resources used. If you look at (a horrid) an old-fashioned flow chart of a reporting system, you only see functions. Now, if the system supports tasking (threading) on multiple processors, I/O detached from the CPU, even channels separated from each other, etc. (hint: a mainframe or server cluster), it is "easy" to see how the separate pieces of information that make up a report can be processed in parallel. Now, think of a transaction as a report and you start finding where parts of it can be processed in parallel; one may not give much, but a couple of million a day will. Agreed, smaller entities are tougher, but there the compilers may come to help; see for example OpenMP or even the age-old vectorizing FORTRAN compilers, and almost every compiler does something depending on platform/OS support. Writing LISP would really make it easier, but at least I don't know any LISP which offers any help parallelizing, and even in some Prologs you have to design your predicates grouped the way they can run in parallel. This is an interesting conversation, but I think sometimes we look too closely at just one program when the real benefits can be at a higher level; exceptions exist of course.

Mark Miller

[i]How do you parallelize a recursive routine?[/i] That's a tough one. The basic definition of recursion, in terms of implementation, is "iterating with a stack". I think it could be done, but depending on how complex the recursive function was, it might not be worth it. Ironically I think it would be more effective the more complex it is; the more stuff that would happen before the recursive call. If you take the Fibonacci function I don't think parallelism helps much with it. It's a few tests, and then a recursive call. The tests could be done by separate processors, but in my view recursion is inherently sequential. Maybe there's a way to optimize it.

Justin James

In and of themselves, recursive routines tend to not multithread too well. However, a functional language, particularly one with lazy evaluation, will multithread like a dream. And recursive functions are a hallmark of those languages. J.Ja

Vladas Saulis

At first sight it looks like this. But it must be implemented at the language interpreter and operating system level. Application programming should be indifferent to this model. This is exactly a [i]What to do[/i] instead of [i]How to do[/i] paradigm.

Justin James

... what I really had in mind when I wrote this was "The Intentional Stance" by Daniel C. Dennett, which discusses the concept of beliefs, and my beliefs about your beliefs. Very, very interesting read. J.Ja

Justin James

VB.Net is pretty symbol-free, so someone can understand its meaning with little-to-no training. The difference between a hard-to-understand language (for a non-programmer) and one that is easy has to do with the reliance on symbols vs. words. The first time I saw Perl, I understood it pretty quickly, except for the regexes I saw. I did not understand why the "equal tilde operator" involved so many forward and back slashes. :) J.Ja

Justin James

You make a great point about OO and parallel code. There is precisely nothing inherent about OO that dictates that it needs to be serial code. But guess what... every OO language (that I am aware of) operates serially with a message pump or event queue of some sort ("pump" and "queue" both mean "serial operation", of course), and doing parallel work is actual effort. In reality, it could (and should?) be the case that method calls can be made asynchronously if the method asserts that it can be. I honestly can think of no reason for OO to require more than an assertion on a method for the system to fire the item in a separate thread. But it would require much better OO languages than the ones floating around the mainstream currently! In other words, the caller needs to be able to act like a taxi dispatcher, not a math tutor. :) J.Ja
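Something along these lines, as a very rough sketch -- the [RunsConcurrently] attribute and the dispatcher are invented for illustration; nothing like them ships in .NET, so today you would have to build the plumbing yourself:

using System;
using System.Reflection;
using System.Threading;

[AttributeUsage(AttributeTargets.Method)]
class RunsConcurrentlyAttribute : Attribute { }

class Worker
{
    [RunsConcurrently]
    public void RecalculateTotals() { /* touches no shared state, so it asserts it can run apart */ }

    public void WriteAuditLog() { /* makes no such assertion, so it stays on the calling thread */ }
}

static class Dispatcher
{
    // The "taxi dispatcher": hand the call off to another thread only if the
    // method itself asserts that it is safe to do so.
    public static void Call(object target, string methodName)
    {
        MethodInfo method = target.GetType().GetMethod(methodName);
        bool concurrentOk = method.GetCustomAttributes(typeof(RunsConcurrentlyAttribute), false).Length > 0;

        if (concurrentOk)
            ThreadPool.QueueUserWorkItem(delegate { method.Invoke(target, null); });
        else
            method.Invoke(target, null);
    }
}

The point is that the caller only declares that the call *can* be dispatched elsewhere; whether it actually is becomes the system's decision, not the caller's.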

Mark Miller

Conceptually this is true. I can imagine a Smalltalk-like system where code within methods is executed concurrently by default (from a single method call), since everything operates by message sends. Rather than making message passing synchronous, messages could be passed asynchronously. It would require some tweaking. Some fundamental semantics of objects would have to change in the system. I think it would have to queue up message receipts for the same method in the same object instance, rather than letting it be executed by multiple objects simultaneously. That would lead to race conditions. I think a continuation: method would need to be added to Object so that sequential execution could still happen. But yeah, I think it could work. :)
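As a rough approximation of that in C# terms (a sketch only -- real message-passing systems such as Erlang processes do far more), each instance could own its own inbox and drain it one message at a time, so sends return immediately but no two messages to the same object ever overlap:

using System;
using System.Collections.Generic;
using System.Threading;

class MessageQueuedObject
{
    private readonly Queue<Action> inbox = new Queue<Action>();
    private bool draining;

    // "Send" a message: enqueue it and return to the caller immediately.
    public void Send(Action message)
    {
        lock (inbox)
        {
            inbox.Enqueue(message);
            if (draining) return;          // a worker is already processing this inbox
            draining = true;
        }
        ThreadPool.QueueUserWorkItem(Drain);
    }

    private void Drain(object state)
    {
        while (true)
        {
            Action next;
            lock (inbox)
            {
                if (inbox.Count == 0) { draining = false; return; }
                next = inbox.Dequeue();
            }
            next();                        // messages to this instance run one at a time
        }
    }
}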

Absolutely

...as long as the programmers are paid bonuses for delivering functionality quickly and not for delivering it such that the processor is maximally utilized for the shortest possible time, the serialization inherent to linear thinking will continue to be written into our programs. In other words, I agree completely. "What assumes serial events in the middle is the way most people think [the sentence could stop there -- It's the way we think strictly linearly & superficially, focusing disproportionately on the most immediate stressor, usually a "deadline" at work] about programming, including the people who design the languages we use (thus making the sort of parallelization-by-default that should be second nature to us an artificially and unnecessarily difficult proposition)." I think I would have enjoyed learning programming from you and/or Justin. I never saw a fraction of this level of passion for their craft from the professors who were paid to teach programming at a certain university nearby.

Vladas Saulis

Yes, Fibonacci is not a very good example. I was talking only about the principle. But if this "addition" operation becomes a big and complex operation, it would make sense. Regarding context switching - it only exists in current threading and multitasking environments. I can't foresee any context switching, for example, in my model. There are no processes (in today's meaning) which must be switched; there are only virtual tasks in it. However, there could be other similar issues working with different processors and the queue. But you are free to parallelize your tasks at an appropriate granularity level, where parallelization would really make sense.

Justin James

Regardless of the model used, parallelisation of code only makes sense when the cost of context switching and data synchronization is less than the muscle that additional CPUs can bring to bear on the subject. It is difficult to imagine a hardware setup so geared towards inter-CPU communication and selective memory sharing that this makes sense for the Fibonacci sequence, since none of the calculations are terribly tricky. J.Ja

Vladas Saulis

I agree in that this isn't possible when thinking in terms of Functional Programming.

Tony Hopkinson

How do you see splitting the calculation of a Fibonacci number over different processors, whether done recursively or rolled out into a loop? In current OSs, which are essentially serial or multi-serial, any gain, no matter how long the sequence, would be lost in the cost of 'switching'.

Vladas Saulis

While I agree that Fibonacci might be an unparallelizable task by itself, I think that in the new multi-processor environment we should try to load-balance the whole system. Every iteration of Fibonacci could execute on a different (least loaded) CPU, so in total the system would have better performance. As a side note, IIRC, there exists a non-iterative (though approximate) formula for calculating the Fibonacci sequence.

Absolutely

[i]Comments are not quite the same thing as documentation... particularly when working with a third-party library, or something you do not have the source code for![/i] ...I don't plan on working on it! Your points are all valid, but it's amusing how different our assumptions obviously are.

Justin James

Comments are not quite the same thing as documentation... particularly when working with a third-party library, or something you do not have the source code for! Comments are for the people maintaining the code after you, including yourself. Documentation is for the person working with what you've written as a user. For example:

Class Foo
    Dim Shared oComputeLock As New Object

    Public Function Bar() As Boolean
        SyncLock oComputeLock
            'We want to ensure that only one instance of the class runs Bar() at any time
            'Do stuff here
        End SyncLock
        Return True
    End Function
End Class

The comments explain why we SyncLock'ed. The documentation would say "This function is thread safe." It's not the best example in the world, but to make it more clear, I believe that most code is over-commented and under-documented. :) J.Ja

Absolutely

[i]Then toss OO out the window... ... because with all of those black boxes, who knows what happens? For example, does the .Count property return a private variable that gets updated when the collections gains/loses elements? Or does it calculate it "on the fly"?[/i] With the deluxe find-matching-text implemented in Visual Studio, I don't hesitate to call the writing of code that doesn't describe such repercussions within each function/method as "unconscionable". This would require us to write our code more slowly in terms of lines of / hour to first draft, but more quickly in terms of hours to good application. [i][b]Undocumented[/b] side effects, while somewhat rare, are a huge concern for me in general. Heck, the biggest is the garbage collector. In essence, GC'ing is a side effect of something else, whether it be a lack of space on the heap, or the process exiting, but when it occurs is impossible for the GC'ed object to predict. As a result, certain situations can be quite dicey, *particularly* in parallel operations.[/i] I agree, the degree of compiler automation makes garbage collection an especially tough subject for a programmer in Visual Studio. You can't very reliably optimize a process you aren't allowed to see. Oh well.

Justin James

Tony - I know exactly what you mean. My saving graces in this arena are that Scheme (a Lisp derivative) was my 3rd language, and that I spent years in Perl, which takes a lot of cues from FP. Even with all of that, it is hard for me sometimes. When I was working with F#, it took me forever to re-learn the basic premise of FP, that there are no "variables" that store data, but function definitions only; a "literal" is still a function, just one that always returns the same value. An example:

let x = 5;
print x;

Really means:

x is a function that returns 5
print the result of the x function

It is a complete 180 from what you are used to if you've spent 30 years working on OO and procedural code. J.Ja

Justin James

Tony - If you want to try it out, I recommend F#, simply because it integrates right up with Visual Studio, runs on the .Net CLR (with all of the related benefits of that), and does not *require* using the .Net Framework, so you can write pure functional code without seeing Java-esque calls to that namespace. J.Ja

Tony Hopkinson

According To Mark M, it's the bit I need to go round, whereas I was trying to go through. Hmmm hard to see where you are going when you've worn a thirty year old rut in your head. I shall have to construct a ladder. :D Meanwhile I shall STFU.

Tony Hopkinson

It's pretty obvious the penny hasn't dropped for me here, but Hey it's not the first time. :D Studying isn't going to cut it for me, have to see if I can load up something that will let me use it, in my copious free time

Mark Miller

I talked about this earlier, but check out "Functional Programming For the Rest of Us" at http://www.defmacro.org/ramblings/fp.html. Slava Akhmechet talks about what I've been describing here. Quoting from the article: "By now you are probably wondering how you could possibly write anything reasonably complicated in our newly created language. If every symbol is non-mutable we cannot change the state of anything! This isn't strictly true. When Alonzo [Church] was working on lambda calculus he wasn't interested in maintaining state over periods of time in order to modify it later. He was interested in performing operations on data (also commonly referred to as "calculating stuff"). However, it was proved that lambda calculus is equivalent to a Turing machine. It can do all the same things an imperative programming language can. How, then, can we achieve the same results? It turns out that functional programs can keep state, except they don't use variables to do it. They use functions instead. The state is kept in function parameters, on the stack. If you want to keep state for a while and every now and then modify it, you write a recursive function." Further down he says: "A functional program is ready for concurrency without any further modifications. You never have to worry about deadlocks and race conditions because you don't need to use locks! No piece of data in a functional program is modified twice by the same thread, let alone by two different threads. That means you can easily add threads without ever giving conventional problems that plague concurrency applications a second thought! If this is the case, why doesn't anybody use functional programs for highly concurrent applications? Well, it turns out that they do. Ericsson designed a functional language called Erlang for use in its highly tolerant and scalable telecommunication switches. Many others recognized the benefits provided by Erlang and started using it. We're talking about telecommunication and traffic control systems that are far more scalable and reliable than typical systems designed on Wall Street. Actually, Erlang systems are not scalable and reliable. Java systems are. Erlang systems are simply rock solid." Functional languages are ones that look and behave more like Lisp, Scheme, Haskell, SML, Erlang, etc. Of the ones I just mentioned, I don't know how many are purely functional. Lisp, for example, is not purely functional, but it started out that way, and displays many of the necessary characteristics. It's not easy to grasp at all if all you're familiar with is imperative languages. You really have to sit down and study it, and work on the way you think about the activity of programming for a while before it starts getting clearer.
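Translated into C#-ish terms (just a sketch of the idea in the quote, not anything from the article itself), "keeping state in function parameters" looks like this: no variable is ever reassigned, and the running total only "changes" by the function calling itself with a new argument.

class FunctionalState
{
    static int SumUpTo(int n, int accumulator)
    {
        if (n == 0)
            return accumulator;                   // the state comes back out when the recursion ends
        return SumUpTo(n - 1, accumulator + n);   // the "modified" state is just a new argument
    }

    static void Main()
    {
        System.Console.WriteLine(SumUpTo(5, 0));  // 15, and nothing was mutated along the way
    }
}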

Justin James

... because with all of those black boxes, who knows what happens? For example, does the .Count property return a private variable that gets updated when the collections gains/loses elements? Or does it calculate it "on the fly"? And if so, does that have any side effects in a special purpose item? Undocumented side effects, while somewhat rare, are a huge concern for me in general. Heck, the biggest is the garbage collector. In essence, GC'ing is a side effect of something else, whether it be a lack of space on the heap, or the process exiting, but when it occurs is impossible for the GC'ed object to predict. As a result, certain situations can be quite dicey, *particularly* in parallel operations. J.Ja

Tony Hopkinson

How do we guarantee no side effects? I put my code through a fine tooth comb trying to get rid of the m'f***ers; sometimes it's not practical. Even if I am successful, how do I know the interpreter / compiler won't create some, and how many are in the OS? No side effects is a mandatory requirement for implementation of parallelism.

Mark Miller

One of the characteristics of a true functional language is that all variables are "const". There are also no side effects. Once you assign a value (a function) to a variable, that's it. This is what makes lazy evaluation possible in recursion, because functions can be executed in pieces, rather than sequentially. I think this makes optimizing recursion easier, because when you're evaluating the recursive call, all you have to do is look at where the variables involved were assigned (this happens only once in the routine), and get the values via lazy evaluation. This could be done in a formulaic way by the interpreter. It could evaluate the tests, and prioritize what code it wants to execute. If it wants to optimize for recursion, it could seek to reach the recursive call at the earliest opportunity, skipping over other code. Once the recursive call is reached, evaluate the sub-functions (if any) used in the expression, and evaluate the variables involved (their value assignments) via lazy evaluation and come up with the next iteration values immediately without executing the whole function, and only later execute the rest when the recursion unwinds. Where it would benefit from parallelism is if you have a situation like:

B: ...
C = A + B

Where B is the recursive element, and A is something only loosely related to the value of B, and can be split off as a separate function. While B is recursing to its bootstrap point, iterations of A can be executed on other processor(s).
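A rough C# rendering of that last case (ComputeA and ComputeB are placeholder computations, purely for illustration): the two pieces are independent, so A can run on another thread while B recurses on this one, and they are only combined at the end.

using System;
using System.Threading;

class SplitWork
{
    static long ComputeA(int n)            // loosely related, independent work
    {
        long sum = 0;
        for (int i = 1; i <= n; i++) sum += (long)i * i;
        return sum;
    }

    static long ComputeB(int n)            // the recursive element
    {
        return n <= 1 ? n : ComputeB(n - 1) + n;
    }

    static void Main()
    {
        long a = 0;
        Thread worker = new Thread(delegate() { a = ComputeA(1000000); });
        worker.Start();                     // A proceeds in parallel...

        long b = ComputeB(5000);            // ...while B recurses here

        worker.Join();                      // wait for A before combining
        Console.WriteLine(a + b);           // C = A + B
    }
}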

Tony Hopkinson

Call something, it causes a change in something else that has 'nothing' to do with it. I can see what you are trying to do, I even agree, I'm having difficulty seeing how it could be achieved though. Even if it was, I can see a lot of developers struggling with it, and I'm not seeing productivity or performance benefits either. As far as I can see this is an attempt to hide the complexity in writing parallelised processes; given previous attempts to hide things beyond Let A = 100, I am not running around cheering. I refuse to accept that it removes it, except possibly at the cookie-cutting level, which I don't work at anyway. It's a lovely idea from a purist's point of view, but get back to me when the hardware and OS are in play. Meanwhile, what can we do with what we've got?

Vladas Saulis

I'm sorry Tony, I'm afraid you are looking at this model through the prism of current programming techniques. Many things in this model should be programmed differently, I think. Also, can you describe exactly what you mean by 'side effect'? In this model everything is done by side effect.

Tony Hopkinson

There is not enough information in that piece of code for anyone to know whether it could be safely parallelised. I used a property in the argument to pass to the call; what if its accessor contained a counter of how many times it was called? A regular and nasty side effect. Locking is the 'physical' side of the story; first you have to write 'self contained' code and police it. How can the assumption that the code won't go out of operational bounds be guaranteed? Can we afford an interpreter that assumes this is true? I can see theoretical ways of doing this; in practice it would be an unholy mess and, as you pointed out, quite possibly kill our current hardware and the OS. If this, then that will work, OK, great, brill, whoopee -- but how do we make sure of this? More to the point, do we wait for some high forehead to give us the hardware first? Or do we look at what we can achieve now? I am much happier with documenting intent, because that is achievable, it can be validated, and as the developer I will be told that my intention and the actuality don't match.

Vladas Saulis

@Mark: How do we resolve dead-lock issues in database programming? There are two common ways: 1. We handle it programmatically (not allowing it to happen by design). 2. By the use of spin locks. My model gives one more opportunity - external injection of Objects. If the system finds out there is a dead-lock somewhere, it hypothetically has the possibility to resolve it by injecting an appropriate method call into the queue. On the other hand, most systems now are interactive, so even an interacting person can inject an Object to resolve dead-locks (it may be a part of existing program logic, or one created on the fly). BTW, the use of dynamic languages (not necessarily JavaScript) is essential in my model.

Vladas Saulis

Justin, you've understood the idea almost correctly. There is a disconnect in task processing. It's like multiuser database access, with the difference that all "users" are the same task's objects! And other "users" can be external tasks' injections. Yes, and this requires a brand new OS. And modified language compilers. But we don't need any new languages for this to be implemented. It's still possible to simulate this inside existing OSes. I'm thinking of proving the concept by use of a XEN virtual machine with a bunch of HTTP servers working together (one for the queue, others - for OPUs). Servers may be easily assigned to different real CPUs inside XEN. @Tony: The compiler makes its decisions based only on the 'hints' provided by the programmer (the .OPU property for any Object). If there are no hints - the program runs sequentially (as a big chunk on one CPU, or parallelised in the default manner if the compiler is smart). Variables are not passed back through the queue, as someone mentioned. These are stored [i]directly[/i] into the persistent task area by each Object. Just like in database applications. It's probably difficult to understand my model, because it has no strict execution path, no stacks, no execution point at a low level. It is a scattered task. The programming model is more like data-flow, though it can be implemented in an imperative manner. The data-which-flows in this context is an object stream flow, and it incorporates methods' code as well. This gives more flexibility to data. If you get deeper into the concept, you can see that parallelizing can be done at any abstraction level, and it is recurrent (i.e. can be propagated from any parallelized part too). In general, the model is independent of language. This can be done at any place in the hypothetical system between any system parts (GPUs, CPUs, IOs). So it would be better to implement a complete OS based on that.

Mark Miller

From my perspective he's not even trying to make a model that would work for .Net or Java. He's just proposing a system model that would at a low level make concurrency easier to implement. In order to get what he's talking about you have to think in terms of how an interpreter or compiler would deal with situations. What he does in his example is inline the loop. Rather than iterating over a code block to create the generated values, the interpreter says, "Okay, the programmer is indexing over a range. I'm going to inline that range." Instead of executing "init value; execute operation, increment value, execute operation; etc...", the interpreter generates these instructions and puts them into the queue:

BigOperation(b[0],c[0])
BigOperation(b[1],c[1])
...

OPU1 can grab BigOperation(b[0],c[0]) and process it. OPU2 can grab BigOperation(b[1],c[1]) and process it, or vice-versa. It would do something similar with your loop:

Queue
-----
item_0 = MyItems[0]
(code for MyClass.MyProperty goes here to be executed)
temp_0 = MyClass.MyProperty (result value)
item_0.DoSomething(temp_0) (operations for DoSomething() go here, or somewhere down the line)
item_0.DoSomethingElse(temp_0) (operations for DoSomethingElse() go here, or somewhere else down the line)
item_1 = MyItems[1]
item_1.DoSomething(temp_0) (operations for DoSomething go here)
item_1.DoSomethingElse(temp_0) (operations for DoSomethingElse go here)
...

This is a statically analytical way of looking at it (not very realistic). It would look more interlaced, jumbled up, but I see what he's getting at. The database/queue acts as a transaction-safe data store for data and code, so that the coder and the OPUs don't have to worry about locking data values. Locking does occur, but only on the queue. Everything can be assumed to be safe to entities outside the queue. What he doesn't make clear is how values, once they're calculated, find their way back to the variables they're supposed to be assigned to. He just says the values are put back in the queue. He doesn't address how to handle side effects. If side effects are eliminated in his system, he doesn't explain how that's accomplished at the language level. What he also doesn't make clear is how the system would handle potential deadlock situations. The OPU system reminds me of what Steve Yegge talked about in a blog post I referred to earlier. He said John von Neumann was working on a cellular processor system before he died, which in Yegge's opinion would've handled concurrency much better than the single processor model we have now.
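A crude way to picture it in C# terms (everything here -- BigOperation, the arrays, two worker threads standing in for OPUs -- is invented for illustration): the loop is flattened into independent work items, whichever worker is free grabs the next one, and locking is confined to the queue itself.

using System;
using System.Collections.Generic;
using System.Threading;

class OpuQueueSketch
{
    static readonly Queue<int> queue = new Queue<int>();
    static double[] b, c, results;

    static double BigOperation(double x, double y)
    {
        return Math.Sqrt(x * x + y * y);    // stand-in for real work
    }

    static void Worker()                    // plays the role of an OPU
    {
        while (true)
        {
            int i;
            lock (queue)                    // the only lock in the system is on the queue
            {
                if (queue.Count == 0) return;
                i = queue.Dequeue();
            }
            results[i] = BigOperation(b[i], c[i]);
        }
    }

    static void Main()
    {
        int n = 1000;
        b = new double[n]; c = new double[n]; results = new double[n];
        for (int i = 0; i < n; i++) { b[i] = i; c[i] = n - i; queue.Enqueue(i); }

        Thread opu1 = new Thread(Worker);
        Thread opu2 = new Thread(Worker);
        opu1.Start(); opu2.Start();
        opu1.Join(); opu2.Join();

        Console.WriteLine(results[0] + " ... " + results[n - 1]);
    }
}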

Justin James

Tony - I think the disconnect (and I think Chad saw it too) is that the idea is expressed in terms of JavaScript (as a pseudocode syntax) and that it discusses a lot of current techniques and such. Once I looked past that, and looked directly at the "OPU" idea (my first mental comparison was ZFS, for some reason), what I saw was an interesting model which really cannot be expressed well with current OS threading models, and possibly not with current hardware models either. J.Ja

Tony Hopkinson

It's hard to know where to start. Can this be parallelised, and what can it be run concurrently with?

foreach (MyItem item in MyItems)
{
    item.DoSomething(MyClass.MyProperty)
    item.DoSomethingElse(MyClass.MyProperty)
}

What serial dependencies are there? Which ones did the coder intend? Which are implicit? Fobbing this off to the syntax and semantics of a language with no explicit intent means one of: crossing your fingers and hoping the interpreter understood 'you'; extending the language to the point where you've added intent but obfuscated the reasoning behind it; or crippling what you can do in the language. And when all is said and done, even if it could be done, the underlying code generated by the interpreter will still lock! What was that written in? All you are suggesting is concurrent cookie cutting!

Vladas Saulis

Routines don't finish or return. They just stop and extinguish when execution ends. There are no return points from any of the Object methods (return statements are only syntactic sugar, which provide a link between the Object and Task areas). All variables are updated directly within a persistent task area (or task database). So, there is no locking in the legacy meaning. There are transactions and atomic access instead. Atomicity must be implemented and provided at the interpreter level.

Tony Hopkinson

It's a client server pattern. Interpreter = Client, OS = Server. Without locking either explicitly in code or perhaps implicitly via the OS, which of course is code. How do you make Routine = finished and Complete = true atomic ?

Justin James

I wrote my response before I read yours. :) J.Ja

Absolutely

[i]I figure you're more likely to find interested students there, though you have to pick private schools carefully as well. From what I hear they can have students that feel entitled, which I imagine is just as bad as students that aren't interested in learning. Anyway, teacher beware. What I was getting at was whether the parents, and therefor the students, are more concerned about them really learning something as opposed to just getting a good grade.[/i] The latter looks to me like one of many variations of the former, and a far less malignant behavior pattern than the various "epidemics" the press reports about public schools, [i]ad nauseum[/i]. Pursuing grades over "really learning" might shortchange the individual student who does so, but doesn't interfere with others' ability to set their own academic priorities.

Mark Miller

What I was getting at was whether the parents, and therefor the students, are more concerned about them really learning something as opposed to just getting a good grade. The way I suspect most Americans view life, at least, is that it's a game, which is pretty accurate. If the school environment, however, is dominated by the idea of "gaming the system", that's what the students are going to learn, not so much what they're really there to learn. Learning can be a game, when learning is the goal, but if the way learning is measured is viewed as a game then they're really missing the point.

Absolutely

[i]I figure you're more likely to find interested students there, though you have to pick private schools carefully as well. From what I hear they can have students that feel entitled...[/i] I wonder if this is more common in private schools, or just more noticeable due to the absence or comparative rareness of more serious problems.

Mark Miller

Re: Sued? I could be wrong on that point. I don't know how it works, whether the school gets sued or if the teacher is open to liability. I mean I guess if you follow school policy you're okay. It would just be a question of whether the policy agrees with you or not. Re: prefer private school I figure you're more likely to find interested students there, though you have to pick private schools carefully as well. From what I hear they can have students that feel entitled, which I imagine is just as bad as students that aren't interested in learning. Anyway, teacher beware. :)

apotheon

"[i]Would I be able to send disruptive kids to the principal's office or would I get sued/fired for doing that?[/i]" Maybe I should work for a private school. I'm not a huge fan of public schools anyway. Fired: So what? I'll just go back to doing what I do now for a living. Sued: That, I'm afraid, is a bit more frightening.

Justin James

I would actually love to be a professor... but my academic credentials would not get me into the hallowed halls of most CS departments (I double majored in liberal arts, and only have a BA). And considering the way CS gets taught at most places, I would be fairly miserable, I suspect. Actually, my ideal course to teach would be called "an hour with Justin". Each class, I bring a different guest in, and we chit chat about whatever. The TAs take notes about the relevant facts that come up. And then you get a test like: 1) Name some of the pros and cons of the Chevy 350 vs. the Ford 302. 2) Is OOP evil or just abused? 3) Metallica or Megadeth? 4) 6 degrees of separation involving the movie "Predator" and politics. 5) Is Russian literature pure genius, or just plain boring? :) J.Ja

Mark Miller

I've occasionally had those thoughts as well though I never take it seriously. There are other considerations as well. How flexible would the curriculum be? Do they mandate teaching a certain form of programming, certain languages? I know for example they typically teach the AP CS course in Java, because that's what's on the test. This gets into a whole other topic that would not be kosher here, but what about classroom discipline? Would I be able to send disruptive kids to the principal's office or would I get sued/fired for doing that? It seems like these days you can't be sure. That kind of stuff runs through my head sometimes, too.

apotheon

I've considered getting a teaching credential and catching them young -- teaching programming in high school or junior high. Part of the reason I'd want to do that is to teach kids to think flexibly and logically before our effed-up society grinds the potential out of them.
