As we’ve discussed previously, the creaky code base in large banks is a systemic risk. Major banks run their transactions on mainframes, and significant portions of the software are both ancient and customized. Since at least the mid-1990s, these banks have had major projects to migrate off their legacy code. Although it is hard to prove a negative, I am highly confident no one has succeeded. Why? It would be such a spectacular accomplishment (and would still be very costly) that any bank that had succeeded would have broadcast the fact.
A recent article on Medium, Interviewing my mother, a mainframe COBOL programmer, gives a sense of why the problem is so intractable. I’ve excerpted key sections and encourage you to read it in full (the other parts of the article have detail on database sizes and organization which experts will find informative). The mother works at a bank now known as Nordea, which has about $700 billion in assets. Contrast that with JP Morgan, which has roughly $2.4 trillion in assets, and a massive derivatives clearing operation on top of that.
From the article:
This position is the most important one in the bank, at least from a technical standpoint. If, let’s say, my mother and everyone on her team were to quit their jobs, the bank would go under within a matter of weeks if it’s lucky. Her team keeps a rotation of people available 24/7. I remember when I was younger and she had to take a taxi to work in the middle of the night on a Sunday to fix a deadlock problem.
…is not a fancy programming language like your functional Haskell or concurrent Golang — it’s an imperative, procedural language and, since 2002, object-oriented. There’s nothing wrong with the language itself; the problem is that barely anyone knows it — at least not in the context of mainframe programming. My mother is the second-youngest person on her team; she was born in 1964, and the youngest person is two years younger. Since almost all of the largest banks in the world run on IBM mainframes with COBOL as the primary programming language, this is a global issue. The smaller banks, however, are better off; they usually run something like Java without mainframes….
Banking systems are also extremely advanced. A personal bank account differs a lot from a business bank account, and there are at least 50 different types of bank accounts for each of them. And in Nordea’s case, they also have the Swedish government accounts, which are different from both personal and business accounts. I think they have the Finnish government accounts and maybe a portion of Denmark’s as well, which differ too.
Clive provided more corroborating information via e-mail:
Did you read what the most highlighted section in the piece had become:
There are programs that are decades old that nobody even knows what they do and the person who wrote it is long gone.
At my TBTF, I heard only last week there’s a routine in COBOL which is something to do with payments. It was written in the late ’60s. What it does is fairly easily discernible from the code, but what is a complete unknown is what upstream or downstream dependencies there might be for this module. It may be invoked directly. Or its output may be checked. Or it could have become an elaborate subroutine for another process. No-one really knows. So there it stays, untouched, undecommissionable.
If it were a lower-priority system it might be possible to experiment a bit and remove it, see what happens. But if it brought down payments or, worse, didn’t bring down payments but corrupted the processing of payments in some subtle but damaging way, the losses could run into hundreds of millions.
If there were a lot more money available, we could suspend the module in dev and see what happens. But we’ve only got three or four “proper” feature-complete dev regions and these are all permanently tied up with other, urgent, projects/changes/fixes. Therefore you can’t tie them up for weeks (or more likely months) doing month-end, quarter-end and year-end simulations. We’ve got a fair few more other development systems, but these are not complete with all the interfaces, so are only suitable for unit testing, which means you can’t do the sort of end-to-end testing needed to really get under the hood of this routine.
Simplest, then, to leave it sitting there. Like one of the monoliths in 2001, a beguiling mystery.
A couple of years ago, there was a big programme started to decommission a load of “legacy” systems. No-one can think of any system which has actually been decommissioned.
At the C-suite level, the drive for bodyshopification continues unrelentingly — take experienced, knowledgeable but expensive subject matter experts, shovel them out, and get in someone (maybe two or even three different people, it doesn’t seem to matter) at 1/3rd the cost in India, the Philippines, Poland (it really doesn’t bother anyone at all where) instead. All they need is a month’s handover and a link to a SharePoint site with the “documentation” and off they go. The churn is immense: these “resources” are rarely there for more than a year, and never for more than two. Oh, and every so often, the whole outsourcer gets the heave-ho, to be replaced by another outsourcer who can cut the day rate by a few pounds here or there.
No, it doesn’t feel like it will end well. But the plates keep spinning in the air for now, so nothing is going to change in the meantime…
On one of our earlier discussions of bank IT risk, a knowledgeable reader said (and I wish I could locate the comment) that the reason banks haven’t migrated off the legacy code is that the idea stops dead when management realizes that it will take three years of their profits to get it done. And that’s before you get to the fact that massive cost overruns in IT projects are routine, and 80% of large IT projects fail.
It’s inconceivable that regulators are not aware of this ticking time bomb, and yet are not forcing banks to act. When this all blows up is anyone’s guess, but a train wreck is inevitable, if nothing else due to the members of the mission-critical teams that baby this ancient code retiring and dying.
I think the risks related to the legacy languages and operating systems are exaggerated. It is true that most of those older systems are a mess internally and very hard to penetrate, let alone make it better over time.
One of the most important tenets of these systems, however, is their stability; despite occasional glitches they run OK and have been doing so for decades. That is, BTW, the reason why IT management never bothered to change these systems: they obviously work.
Of course, it is more and more expensive to operate these systems, as the hardware becomes ridiculously expensive over time and it is harder to find people to operate them. But the fact is this has been so for quite some time, and apparently the costs and risks of replacement do not warrant an upgrade. However, the businesses change, and some newer systems, interconnected with the older ones, gradually and naturally eat away the critical parts of the older systems.
I had a friend who was one of the last two people able to make modifications to IBM mainframe programs written in PL/1 with ADABAS database systems, both of which have been passé for decades. Those systems were mission-critical for the organisation. Yet the retirement of my friend, who was sure he would be called back as a consultant at a fat rate, did not disrupt anything. The remaining person, even though much less knowledgeable than my friend, was able to continue to run the system; another person was assigned as a backup and started training. Actually, I recently heard that guy has also retired (I no longer work for that organisation), and of course the system continues to function.
I find this practice completely rational and the best way forward. The mainframe systems that run the mission-critical analysis have been replaced by newer systems gradually (a decades-long process), not by functional equivalents but by other systems that connect to them and gradually take business data and logic away from the older ones. I am sure that after some 30 years or so, most of the mainframe systems running ancient languages and operating systems will totally disappear without any fuss.
Yes, the system I mentioned in the above piece has a spectacular uptime and reliability record. It’s something approaching 6 sigma, it never fails. What does fail, all the time, are the systems which interface to it. Without being able to give away too much detail, a major customer-facing system suffered a half-day outage because it had entered a race condition following a change. What hadn’t been appreciated by the team making the change to the customer-facing system was that there were two near-identical interfaces to the payment system. No-one knew why this had been implemented, but there it was. The post-incident review concluded that a newer interface had been built to add some additional functionality and expose it to downstream systems rather than updating the original interface. One of the developers on the customer-facing system called the older routine, another one called the newer routine. Both were in the documentation and both were still live.
So rather than as you suggest newer systems eating away at the legacy ones, the reverse happens. The legacy systems get ever more baroque to support the demands (Chip and PIN, Faster Payments, ApplePay, Mobile…) of the newer systems which the marketing people demand.
This is because it is too costly and too risky to rip out the legacy systems and start again with a greenfield.
Agreed — you can go into many banks and observe that the apps used by the reps are screen scrapers that take the CICS output and reformat it for display purposes, just as (although not as visibly) most online apps are front ends to the basic account COBOL programs. (You do also see some apps at the bank that are direct CICS screens through 3270 emulators.)
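As an illustration of what such a screen scraper does (a hypothetical sketch — the screen layout, field positions and account data below are all invented): the CICS transaction paints a fixed-width 3270 screen, and the front end simply lifts each field from a known row and column position.

```python
# Hypothetical 3270 screen text as the CICS transaction would paint it.
SCREEN = [
    "ACCT INQUIRY                          PAGE 01",
    "ACCOUNT NO: 12345678   TYPE: CHK",
    "BALANCE:      1,234.56   HOLD:      0.00",
]

# (row, start_col, end_col) for each field -- an assumed, invented layout.
FIELDS = {
    "account": (1, 12, 20),
    "type":    (1, 29, 32),
    "balance": (2, 8, 22),
}

def scrape(screen, fields):
    """Extract and trim each field from its fixed screen position."""
    return {name: screen[row][start:end].strip()
            for name, (row, start, end) in fields.items()}

record = scrape(SCREEN, FIELDS)
print(record)
```

The fragility of this approach is the point: the scraper breaks the moment anyone moves a field on the green screen, which is one more reason those screens never change.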
I have spent quite some time thinking about this, for various reasons, and believe that the only real solution is what I call “Project Phoenix”. Simplified: your large bank sets up a small, purportedly low-cost bank, which builds everything greenfield, and runs for a few years. Once it’s bedded in, TLB moves all its customers to the small bank.
This still runs into a number of problems, not the least of them being political, so don’t expect to see it.
On the other hand, a number of fintech startups (the feasible ones being few and far between) compete on lower costs due to not having to maintain creaking masses of infra. Unless the larger banks manage the move, they are toast even on what are still reasonably profitable businesses.
Oh yes, another problem with the mainframes – as often as not you can’t find replacements for the failing HW that comes from the 1980s/1990s anymore.
Have never heard of any problems replacing mainframe hardware. State of the art processors are available from Hitachi and DASD is updated also. Where did you get this information?
You may be right that fintechs will eat the not-so-TBTF banks’ lunch, but consider the following scenario: these fancy fintechs will build everything greenfield – fine, but by whom? By some offshore development center delivering code that no one can figure out even 5 years from now? Who is to say that the fintechs are better than the old banks at managing the technology risks they incur while building and maintaining their fancy new systems? Given how old those systems at traditional banks are, they have been doing pretty well, I would say.
And, just this morning there was news about Comdirect, an online bank in Germany, screwing up their systems such that users were logged on to accounts other than their own for a few hours (at least the users couldn’t transact from there). Expect to see more of that as well, not just old-style bank IT systems failures.
I’m not saying that greenfield development will solve everything – indeed, it can mess things up spectacularly.
On the other hand, your solution space is – try to figure out how to fix the legacy systems and disentangle them (where interfaces between systems are often not very well understood, side effects are common etc. etc.), try to do in-place replacement (which often doesn’t work, because replacing all at the same time is too expensive, and doing anything else runs into the interfaces problems), or a real greenfield.
So it’s definitely not a silver bullet (and a number of fintechs I know produce crappier systems than a lot of banks do), but it gives you a chance to get things “righter”. Then rinse and repeat in 15-20 years, and again, you can move a bit closer to a better solution (which is why I call it Phoenix).
The main advantage to Vlade’s approach is that the new work gets done under new management who are less likely to get cold feet because it will cut into existing profits. Other than that the corrosive human resources bodyshop mindset is likely to reduce the chances of success for the new venture. These managers are all living in Plato’s cave and incapable of accepting reality. We’re literally at the end of our own “golden age” here, in complete denial that the Spartans are on their way to shut us down.
I guess you don’t keep up, but the latest mainframe was just released last year. Course maybe you missed the big announcements. And it’s not like HW hasn’t been updated before that; that seems to happen about every 2 years. And they have little to do with the ’80s/’90s versions, other than that they can run the same programs.
Always wondered why some seem to think you need to buy the latest HW and SW just because it’s new, in an industry that is very faddish, in that we always have the greatest new tool, which sometimes lasts a few months, if that. For example, care to guess what the latest, greatest fad was in the ’80s? Ever heard of code generators? Or what the latest new language was back then?
Vlade: have you seen IBM’s shiny new z13 mainframes? Each is the size of two refrigerators, and can kick the living snot out of mainframes of the past. It can run thousands of Linux instances and legacy mainframe workloads at the same time…
This is primarily a software problem, but there are hardware implications. Yes, you can buy new hardware. But the test pack which you have to run when you attempt the migration is immense, and it has the risk that it doesn’t know what it doesn’t know. By which I mean: if you have spaghetti systems, it is impossible to know for sure you are testing the right things in planning your hardware migration event. Oh, and it means a change freeze, which has big impacts on meeting regulatory change timescales (you can’t tell the Bank of England — not easily anyway — “oh, sorry, we can’t do that Brexit thing this year, we’re planning a hardware upgrade…”, for example).
You end up making assumptions and taking a few chances here and there. But these are not systems which can stand multi hour outages while you try to get everything rolled back. And even if you do roll back your hardware upgrade change, you’re simply back where you started — on old hardware trying to figure out what went wrong and how to prevent it happening when you try again.
Methinks you doth protest too much. This is old hat to many in regulated industries where redundant infrastructures are mandated, tested and verified by many eyes, including the overseers.
No-one is mandating anything. It is the legacy estate and is a fact for most, if not all, large financial institutions. You can’t wish it away, you have to manage it. You don’t do that, successfully, by channeling your inner Pollyanna.
Sadly, the vast majority of bank CEOs and regulators think exactly as you do. The all-too-frequent outages which result are a cost to their customers and a tax on the rest of the economy.
And we do manage our estate very well, thank you very much. We exceed our SLAs consistently, and continue to introduce new functionality as fast as we can with high quality, which has allowed us to grow organically without having to resort to financial chicanery. At one point we looked at offshoring our production support, but when we looked at the numbers and the risk involved, it was felt best to keep that function in house. Not all places see the bottom line as the be-all and end-all that you cynically appear to have taken to heart. That’s also why we have actually started our own in-house training programs, too.
My experience in a number of different FSI IT shops is that the mainframe environments are the most well managed and stable of all technologies. Yes, the code is a mess and very old, but the entire front to back estates are managed very well, on what has become repeatable upgrade cycles that are executed with precision.
As Clive mentions above, the problems are usually outside of the mainframe estate where client facing applications are running.
And 20 years ago Tech Management was worried about the lack of COBOL developers long term, and frankly it has not been an issue that I have seen manifest itself. I have seen significant investment in building and developing these skills for the long term in multiple geographies, and retention has always been a high priority in these areas.
Banking (at large banks and securities firms) has at least an order of magnitude more complexity and transaction activity than your industry, and I’d bet more like two or three. Generalizing from outside banking to banking is proof you don’t know what you don’t know, and are loud and proud about it too.
There is a major feature of MVS (or whatever its derivative is called today, z/OS) in that TCP/IP is a pure add-on subsystem, not involved in the OS kernel as in Unix or Windows. Today’s mainframes can run multiple instances of the legacy OSes as well as Linux (and depending on the hardware you can buy add-ons that run one or the other).
Of course the problems with software date back to the 1960s, when IBM wrote the original software for mainframes. As documented in “The Mythical Man-Month”, throwing more folks at a problem just makes it take longer.
It’s not like you don’t have issues when you update a non-legacy system, is there? Just changing, say, the driver in a distributed application can break it, never mind upgrading the operating system (it seems that almost no large companies ever adopted Windows 8 or Vista, mainly because there just wasn’t a need to, or because of the risk of breaking their systems). And that’s with non-legacy code. And the testing facilities for the applications (not unit tests) aren’t any better, and could actually be worse. And upgrading HW in a non-legacy world is no better; new HW can also break your application.
In short, upgrades to OS and HW aren’t less hazardous in the non-legacy world than in the legacy world. The legacy world just tends to have a much higher requirement to be available.
The dependencies remind me of evolutionary genetics, where every gene is tied to everything else and experts have only the vaguest notion of what’s going on. (If only Mother Nature had written elegant procedures in Pascal…)
One of the very few benefits of antique code is that it often requires archeological expertise to hack it. Security via obscurity, it’s called.
Of course, if disgruntled ex-contractors in India are the only ones who know how your system really works, how secure are you?
Great target for public utility investment in the coming Second Great Depression…
It all depends on how well-factored the code is. You seem to be describing a specific anti-pattern referred to as spaghetti, and bad code is ultimately a product of poor management, often stemming from a failure to understand the hidden costs incurred in software development.
There’s nothing absolutely wrong with old code per se as long as it is readable and well-structured and is maintained regularly.
Also, very few programmers in history have ever liked code written by other programmers. This seeming law of the universe has been responsible for an incredible amount of damage over the years.
You are being disingenuous. The issue is lack of documentation, which is endemic in banking. The development teams forevah have been under pressure to get the code into production. Documenting their work saves $ over the life of the codebase, but adds 20% to upfront cost and, more importantly, to dev time, and that’s deemed unacceptable. I’ve seen this first hand at players that were recognized as having the best IT management in finance in their day and devoted tons of $ to it (literally half the firm’s revenues went to IT).
if i could be ingenuous (sic), isn’t that the problem in a nutshell: we clever nekkid apes have flaked flint into complex systems beyond our control, based on fractions of rationales, and miniscule (spel czech sux) morality…
they have literally grown beyond our immediate control, and have ushered in an era of bizarre and byzantine financial instruments, with derivatives of derivatives of derivatives, until everyone is betting on everything and nothing can fail…
in other words, cloud cuckooland…
um, shouldn’t we stick to stuff we kinda/sorta unnerstan ? like bushels of wheat, pork bellies, and company stock ?
Is there an economic term for those workers whose skills are so integral to maintaining an enterprise that to “remove” these workers would destroy said enterprise? I always thought some Tyler Durden-esque evil genius with some serious capital could simply pay certain people to “walk away”. Could these high-margin productivity workers (alpha-knowledge workers) be “compelled” to work? Give them each 5 million dollars to quit. What happens then? Extrapolate from COBOL programmers to cloud infrastructure engineers (i.e. at Amazon, Apple, Google, Rackspace, etc.). I would bet these IT engineers max out at $150-250K. How much would it cost to have them voluntarily disappear and never return to work? 2 years’ salary? 5-10 years’ salary? Just a thought….
Galt’s Gulch for the coding creators who carry the world? Atlas chugged.
Er, want to sell the movie rights?
But if they all walked away for enough money, and all the money-based and money-mediated systems crashed; what could they do with their money in the absence of civilizational survival systems to spend their money in? What if they were to all think about that and all decline the offer of enough millions of dollars to walk away . . . . because they all feared having all their money crushed under the falling house of iron cards?
What would they do for food if all the money died and went to money heaven . . . if they withdrew their support from the systems which make money function? Would they fight over acorns with squirrels in the park? Turn over boards and eat the earthworms they find there? If they think about that, they may decline your kind offer.
What form might a train wreck take in this context? Suddenly a TBTF bank goes “offline”?
Also, this caught my eye:
“Flexible” offices are a new trend here in Sweden and I have yet to meet anyone subjected to it who thinks it’s anything other than awful.
Examples of that are breakdown in payments systems. Here in the UK, Natwest (a bank owned by RBS) was plagued by payments problems in the last few years (last one as recently as Jan this year). A friend of mine used to do their business banking, while he was in that role, I pretty much wasn’t able to get hold of him, as he was always firefighting.
For an example of what can go wrong, check the articles on The Register from when RBS had a big fail in its IT systems (2012). Their ATMs were off-line for quite some time. And this was partly caused by firing the internal experts and hiring cheap people from India. They got a fine for this as well.
Flexible offices are just a way to save money. In The Netherlands they’re also popular, but given the number of people who work part-time, I can see the point. I prefer working from home myself.
Ricardo Semler talks about them in Maverick and/or The Seven-Day Weekend. They use these successfully in one or more of the companies that make up Semco Brazil. However–and this is key–such developments come from the bottom up, the workers themselves. If they want a flexible office it’s because it suits them, either because of a dramatic reduction in commuting costs and time, or other reasons.
Much ado about COBOL.
Technically, the systemic risk is better blamed on DASD, ISAM and legacy queuing artifacts whose misbehavior is fully priced in.
Could add CICS, VSAM, IDAM, FORTRAN, IMS…….
I am surprised the referred article did not mention REXX.
REXX is drop-dead elegant and simple-what’s real interesting is the amount of M being used as part of GT.M applications by FIS….
I hate this COBOL beating. It is not COBOL that is a “risk”. Some of the legacy systems are, though. But that has nothing to do with the language they are written in.
COBOL is easy to learn (and to teach), way easier than C++ or some “modern” languages. If one needs COBOL programmers, one can easily set up classes and create some. There is no secret in how to do that.
The language is very easy to read. It is actually difficult to write cryptic code in COBOL while cryptic writing is nearly a necessity in C, Java and the like.
We should all be very happy that these old systems are written in COBOL. That gives at least a chance to understand them and to correct them when necessary. That would be impossible if they were written in a more system- or math-aligned language.
Finding out who or what calls a certain routine, as in the example above, does not require a miracle. There are standard tools available to detect dependencies in COBOL as well as in other languages.
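A minimal sketch of the kind of static scan such tools perform (the program names and source fragments below are invented; real tools also have to resolve dynamic CALLs, COPY books, JCL and CICS links, which is exactly where the hard cases live):

```python
import re

# Invented COBOL fragments standing in for a source library.
SOURCES = {
    "PAYMAIN": """
        CALL 'PAYCALC' USING WS-REC.
        CALL 'AUDITLOG' USING WS-REC.
    """,
    "BATCHEOD": """
        CALL 'PAYCALC' USING EOD-REC.
    """,
}

# Static CALL statements with a literal program name.
CALL_RE = re.compile(r"CALL\s+'([A-Z0-9-]+)'", re.IGNORECASE)

def callers_of(sources):
    """Invert CALL statements into a 'who calls whom' map."""
    callers = {}
    for program, text in sources.items():
        for callee in CALL_RE.findall(text):
            callers.setdefault(callee, set()).add(program)
    return callers

print(sorted(callers_of(SOURCES)["PAYCALC"]))   # ['BATCHEOD', 'PAYMAIN']
```

This finds the easy, literal dependencies; a CALL through a data name, or an invocation from JCL or another system entirely, is invisible to it, which is Clive’s point above.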
The real problem is short-term management thinking. Long-lived IT systems demand long-term investment, in people as well as in the software. Ask for a five-year project to rewrite a 30-year-old IT system and you will get kicked out of the manager’s office. Instead some short-term “overlay” fix will be applied, implemented undocumented in a cryptic format by some geek who will be off to nowhere soon after.
Those fixes are much more the problem than the original base.
The regulators do indeed need to wake up. But they are under political oversight which is just as short-term oriented as the bank management.
It’s easy to write structured code in any language provided you know how to do it and the value of the process. I can write COBOL that would twist your eyes and give you a screaming headache.
It and RPG II were the first languages I learned (and I HATE computers, have since forever. Why won’t you do what I want rather than what I tell you?).
“Instead some short-term ‘overlay’ fix will be applied, implemented undocumented in a cryptic format by some geek who will be off to nowhere soon after.”
This hits the nail on the head. No one wants to re-write legacy code; but little ‘fixes’, hey, they are easy, they are everywhere and they are rarely tracked, let alone documented. As well, management, and the people supporting the code, often have no clue which of the little ‘fixes’ are now vital to the system because the little ‘fixes’ themselves are now 2nd or 3rd generation in a 10th generation of little fixes. And as for COBOL being not well known, fixes could be in APL, RPG, BASICA, Rexx, VBasic, or a host of other languages that have fallen by the wayside. All this makes even attempting to re-write legacy code almost impossible. @vlade’s approach of a fresh start, from the ground up, with a smaller bank is probably the sanest approach but experience says, where money is involved, no one is sane.
COBOL is easy to learn. I learned BAL in a one-week Assembler Language Coding class at IBM. But I learned COBOL while I was eating lunch one day in my car at an A&W Root Beer stand. It was a snap. I have ventured to try to learn C and the rest, but it is just too depressing. What I see makes me shudder to think how difficult it must be to develop systems. And the job security that people say COBOL creates for ancient programmers is nothing compared to programming languages today.
In my working life as a programmer 1965-1995 I worked for companies and had my own company that took over the data processing operations of large enterprises. Most of these had COBOL mainframe systems. We needed to know the connections between modules during a cycle and over time. So, we wrote a program to analyze the code. It took a little refining but at the end we had a solid tool. Likewise with reporting. One of the reasons that we were asked to come to a company was that the managers could not get the data they needed. They would show us bits of what they wanted that were already created by their systems, but not in the structure they wanted. So, we just wrote a tool that spooled printer output to tape and then processed the tape to build a data base for reporting. This gave the managers a lot of the data they wanted in a more usable format, and it gave us time to develop a full-blown new system of reporting.
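The printer-spool trick described above can be sketched like this (a simplified illustration; the report layout, column positions and figures are all invented): treat the spooled printer output as a fixed-width data source, skip the headings, and rebuild records a reporting database can load.

```python
# Invented fixed-format printed report, as it would arrive off the spool.
REPORT = """\
CUSTOMER BILLING REPORT              PAGE  1
CUST-NO   NAME                AMOUNT
00017231  SMITH J                            45.10
00017244  JONES P                           112.75
"""

def parse_report(text):
    """Rebuild structured records from fixed-width report lines."""
    records = []
    for line in text.splitlines():
        # Detail lines start with an 8-digit customer number; headings don't.
        if len(line) >= 8 and line[:8].isdigit():
            records.append({
                "cust_no": line[0:8],
                "name":    line[10:28].rstrip(),
                "amount":  float(line[28:].replace(",", "")),
            })
    return records

rows = parse_report(REPORT)
```

The payoff, as described above, is that the reports the old system already prints become a data feed without touching the legacy code at all.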
Another thing about those early days in systems is the use of sequential (tape) files. The mental gyrations one would go through in order to have the data one needed at the time it was needed and in the sequence one needed were immense. One learned to be very efficient in the organization of data files. When we transferred those files to disks later we were very efficient and our systems were very fast. But when data base packages came along we found that transferring our files to these general tools slowed us down in many situations. As a result we maintained the sequential files on disk and built our own keys. In some cases, such as a telephone company, we created customer master files that were logical not physical. There was one file per customer–the operating system was the index. It was fun, it was fast and efficient, and we might never have done it without cutting our teeth on tapes.
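The “built our own keys” idea can be sketched as follows (a hypothetical record layout): keep fixed-width records sorted by account number and let a binary search over record offsets stand in for a database index.

```python
import bisect

RECLEN = 20  # fixed record length, invented for this sketch

# Build a sorted "file image" of fixed-width records keyed by account number.
records = sorted(
    f"{acct:08d}{bal:10.2f}".ljust(RECLEN)
    for acct, bal in [(17231, 45.10), (17244, 112.75), (17301, 9.00)]
)
FILE_IMAGE = "".join(records)   # stands in for the on-disk sequential file

def lookup(file_image, acct):
    """Binary-search the sorted fixed-width records for an account number."""
    keys = [file_image[i:i + 8] for i in range(0, len(file_image), RECLEN)]
    i = bisect.bisect_left(keys, f"{acct:08d}")
    if i < len(keys) and keys[i] == f"{acct:08d}":
        rec = file_image[i * RECLEN:(i + 1) * RECLEN]
        return float(rec[8:18])   # balance field
    return None

print(lookup(FILE_IMAGE, 17244))   # 112.75
```

Because the file is already in key sequence, batch processing stays a fast sequential pass while random lookups cost only a handful of probes, which is the efficiency the comment describes.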
The problem in bank IT isn’t to do with the simplicity, or otherwise, of the COBOL code. As I said in the piece above, the module we found in my TBTF’s payments system isn’t complicated and it is easy enough to understand what it is doing. What isn’t known is why it was created in the first place and whether it is still needed. The payments system was originally written in COBOL and ran as a stand-alone application. But over time the payments system has ended up becoming both an upstream and downstream dependency for other systems (Faster Payments (STP), Online Banking, Securitisation, Card Acquisition, ApplePay, dozens and dozens of others).
So the issue is whether the mystery module is needed and if it is needed, what dependent system or systems are relying on it.
A better understanding of the COBOL in question and how it relates to other payment systems COBOL routines isn’t going to fix this problem. What is needed — and what isn’t available because it was a) never documented and b) if it was understood by someone or a group of people then that person or those people have left — are the details of which other systems have a dependency on this module. Or else a confirmation that it has become an orphan and can be safely removed from the baseline.
The only method of getting that understanding is complex testing in a feature-complete environment. This costs money. The expenditure of that money would create no benefit, in the short term, for the TBTF. The regulators do not understand the risks of this pollution of a critical codebase or if they do, then they are being talked out of any enforcement measure by the management.
And as I said, it is possible to write a program that will look at the COBOL source code and at the data files, including the fields referenced, and produce a result that maps out the connections you need. We did it all the time for years. Admittedly, the problem got a little harder with the introduction of ISAM files and the like. By harder I mean it took more work to do the job, but the task was the same. It sounds like grunt work and it is, but it pays off. After you have done your first such mapping program the next ones become easier. After two or three of them we were able to produce a process, including prototype programs, that sped things up. But that is why our customers hired us: they wanted somebody to do the hard work.
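The data-file side of such a mapping can be sketched in the same spirit (program and dataset names here are invented): scan each program’s FILE-CONTROL entries for SELECT … ASSIGN TO clauses to see which programs touch which datasets.

```python
import re

# Invented FILE-CONTROL fragments standing in for a source library.
SOURCES = {
    "PAYMAIN": """
        SELECT PAY-FILE ASSIGN TO PAYMAST.
        SELECT AUDIT-FILE ASSIGN TO AUDITLOG.
    """,
    "RPTGEN": """
        SELECT PAY-FILE ASSIGN TO PAYMAST.
    """,
}

ASSIGN_RE = re.compile(r"ASSIGN\s+TO\s+([A-Z0-9-]+)", re.IGNORECASE)

def users_of_datasets(sources):
    """Map each dataset to the programs whose FILE-CONTROL references it."""
    users = {}
    for program, text in sources.items():
        for dataset in ASSIGN_RE.findall(text):
            users.setdefault(dataset, set()).add(program)
    return users

print(sorted(users_of_datasets(SOURCES)["PAYMAST"]))   # ['PAYMAIN', 'RPTGEN']
```

Cross-referencing this dataset map with the call map gives the “connections” the comment describes, though as Clive notes it still cannot prove a module has no callers outside the scanned library.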
Fine, but how do you test it? You’re trying to solve a problem that is code-related with more, different, coding. That’s a legitimate approach (automate the task you need to complete), but at some point you have to stop theorising and commit to an implementation — to decommission the now-defunct module because you’ve ascertained that it is no longer used. So you’re pinning your hopes on a new piece of software you’ve just written to determine whether an old piece of software is or isn’t referenced anywhere else. What does your testing look like for the thing you’ve created to do the mapping? You can run your mapping again and again, but there’s nothing you can do to guarantee there are no bugs or fundamental design flaws in it.
You’ve just landed on the big snake and are now back to where you started: either taking a risk and decommissioning the suspected-obsolete module or playing it safe and keeping it there, just in case.
You have improved your view of the system (or, you’ve allowed the team who ran your mapping simulator to convince you that you’ve improved it). So you’ve reduced the risks somewhat. But you’ve not removed them completely. Unless you can remove them completely — and you’ve not explained how you have done this — you’ve not changed the dynamic at all. You’re still back to pressing the big red button in Live and keeping your fingers crossed or, more likely in big organisations’ cultures, doing nothing and saying IBGYBG.
You test it by using it. I thought we were talking about how to deal with the mysteries inherent in COBOL legacy systems. We faced this problem constantly for years and we found that the only way to deal with undocumented systems and with the unavailability of access to the original developers was to develop the documentation ourselves with a focus on what we needed to know in order to accomplish two main goals: improve the data that was available to users, and to buy time while we developed new systems. We got called in for two main reasons: new system requirements such as a Blue Cross Plan which was about to become the Medicare administrator for an entire state, or a phone company which was signing up new customers at an amazing rate but could not service them. In both cases the managers of the enterprises knew that something had to be done and fast. We were called in and followed the process I have described. I have done what you say I have not done, I have explained how we did it. But in order for anyone to understand this process one has to park the car and think about it.
We had a two-track process. One was to provide new features that were necessary to keep the current system alive. This also gave us the information we needed to make sure that we understood, at the coding level, just how the existing systems worked. This fed into the second track: the design, development, and implementation of a new system.
All I know is that the process worked. The customers were large, they had big problems, they had bigger needs, and at the end of the day their enterprises were transformed for the better.
> You test it by using it.
I think you guys are talking past each other because you’re really in different businesses, even though you are both in IT (and what, at the end of the day, is not IT).
Clive can’t test a system by using it, because the risk of an outage is simply too high.
The system I’m talking about is the one to find all the interconnections between programs and data in COBOL legacy systems. This program has nothing to do with the production of the host enterprise, rather it is a tool to aid programmers who are being asked to maintain an existing production system. Clive, or somebody upstream, was pointing out that legacy systems are not documented so it is very difficult to learn the details of how the code works. I agreed with that observation and explained, as best I could in this limited space, that we had developed a tool that would enable us to map which programs accessed which data fields so we could learn how each field was used. The idea is not difficult to understand and the programming, while tedious, is not difficult.
Grunt work solved a lot of problems in my day. We did not look for solutions off the shelf because there were none. But today it seems that if a ready-made solution is not available then people throw up their hands.
I don’t recall that Clive advocated a ready-made solution. I’m glad your tools work in your space.
That may be, but that is not what I saw in his comments. Just as he says that my approach is ultimately fatal or too expensive I say that he is hoping for some tool that will eliminate risk and cost.
There are no such ready-made tools for dealing with COBOL legacy code, but by doing the hard, tedious work, one can still move forward and update the system. He laments the difficulty of the problem and I agree, it is a hard problem — but it can be overcome, by doing the work.
We used our tool and it worked. We saw modules in the existing systems that we did not understand and we did not know if they were essential. So we ran tests to find out and after a while we were able to keep the module or delete it.
Finally when Clive, or anyone, digs in their heels and does not want to do the work because of difficulty or cost, then what are they hoping for? Divine intervention or a ready-made solution?
We made our living doing the hard work that others could not or would not.
Sure, you can have the last word. I’m sure readers are canny enough to see that you’re being non-responsive, both to Clive’s comment and my own. I love the amour propre displayed in the last line. Best of luck to you.
I was lucky enough to attend a small lecture by Admiral Grace Hopper, one of the team who created COBOL. It was an interesting experience. She handed out souvenir nanoseconds, pieces of wire cut to a length that electric current would traverse in one nanosecond. Anyway, that was in 1977 or 1978, and I was more oriented toward BASIC and microcomputers, even though it was a few years before IBM brought computer power to the people. I think even then there was talk about how COBOL was becoming obsolete. As you point out, it’s a remarkably easy language to learn. The idea, according to Adm. Hopper was that businessmen and accountants could write their own programs. That never happened, of course. The real problem, as you point out, is the many years of encrustations, “bug fixes,” “updates,” and added bells and whistles that are not documented and for which the source code has been lost. Companies have become completely unwilling to pay for training their workers for anything. They also won’t raise wages to reimburse workers for investing in their own training. They think they will always be able to get H-1B visas to bring in young kids who have been trained at public expense in India and paid half what they were paying the American workers who are required to give them on-the-job training before leaving. This is not going to end well.
Thanks for pinning the tail on the right donkey. The above interesting post defines a technical problem and those are the ones that can be solved if anyone has the will to do it.
All right, what are the regulators then supposed to do about the mess?
Actually, I just wonder whether the reluctance to apply one of the remedies suggested after the 2007-2008 crisis, i.e. splitting banks along lines of business or national subsidiaries, was not also due to the fact that it is next to impossible to split the corresponding IT systems.
For one thing they were supposedly charged by Dodd-Frank with ending too big to fail. They haven’t. For another I believe that the government has the power to force banks to act prudently given their FDIC backstop. And if they don’t have that power then Congress has the ability to give them that power. My reading of the above article is that this is a failure by the banks to throw enough resources into solving the problem. They could pay young people large salaries to master the COBOL system and start migrating away from it despite the cost. It’s like the typical IT complaint that companies can’t find Americans to do the work when what they really mean is that they don’t want to pay Americans enough money to do the work.
Neoliberals love “free markets” as long as they are on the winning end of the transaction. When things turn sour suddenly it’s whocouldanode.
And, while I agree that vlade’s greenfield approach might be the best way to rewrite financial systems, hard experience with rewrites suggests that you’re as likely to cause a crisis with a rewrite that misses something subtle as you are to cause a crisis by leaving ‘obsolete’ technology in place.
I believe that the idea of a “greenfield” development is that it also offers the opportunity to reset the entire system: reset the entire offering, clean up account structure, get rid of those 50-year-old savings products, etc. Then tell customers they have to migrate to the “new bank” and cram down the new general terms and conditions.
This is basically what well-established banks did when they set up separate internet-only banks, with separate offerings, management and branding.
And how many customers will the bank lose because of the cram-down? Will the cost of that, and the lost income, cost the bank more than the estimated cost savings (which tend to be more mirage than real)?
But as noted, the risk from starting over isn’t small, and the cost isn’t either. And the savings may never happen. There have been many examples of large projects to do this failing. This is basically what kept Greece in the eurozone: they couldn’t recreate their old currency, and that was a much simpler task than the ones a bank would face in replacing the code that is its business.
It’s like trying to replace your car engine while you’re driving it.
The system includes the expectations of users, customers, owners, and upstream and downstream dependencies of various sorts; ‘greenfield’ focuses on keeping all of those things while completely replacing the technical internals. So, to be successful, greenfield has to perfectly emulate what everyone wants to keep while eliminating only what no one wants to keep. IOW, Jerry Weinberg’s words to be specific, ‘No matter how it looks at first, it’s always a people problem’.
Or it’s the DWIN function — the Do What I Need function.
Popular love for programming languages seems inversely proportional to the size of their “hello world” samples. Java and COBOL are “dowdy” because they have a ceremonial boilerplate problem. Fortunately, Java (and COBOL) are not the only languages that run on a Java VM, and there are other languages that enable more provable programming constructs and paradigms. Then again, anything’s more provable than cowboy coding.
(Bonus: Oracle gets to keep its rice bowl!)
Has Ada — the aviation and defense safety-critical structured language — ever been applied to banking? It was a 1983 military standard, originally developed by the DoD for real-time systems.
And when was the last time anyone talked about it?
I respect Grace Hopper’s development of the practice of computing but some bad ideas — like flowery diplomatic levels of formality in English-like programming languages, and managers treated as first-class enterprise programmers — ought to be consigned to the same dustbin of sentimental habit and ritual as syncing twice before shutting a Unix system down.
Ada is of that Hopperite tradition: huge headers that resemble cover sheets for fax transmittal including not particularly helpful declarations sitting atop a few lines doing actual, real work. That’s why it seems more people try to reinvent the same with lighter cultural baggage rather than actually using Ada, even with GNU implementations available. Hardware design languages, same deal: VHDL, another Hopper-descended language with all that entails, has lost much ground to the more Pascal-like newish-comer Verilog, especially in the civilian space. And since then SystemC has become a thing, allowing even more terse design of hardware.
And yes, terse coding is important: Paul Graham’s essay Succinctness is Power references none other than The Mythical Man-Month, claiming that programmers produce about the same number of defect-free significant lines of code per day, whether in IBM assembler or higher-level PL/1. It seems intuitively reasonable to me that screen real estate and visible characters substitute for what will undoubtedly be a very crowded working (human) memory in much the same way as scratch paper, and that a tighter (yet halfway familiar, *cough* APL) notation for specifying functionality and connectivity can be more productive.
While some languages are less prone to being abused by programmers than others, ultimately they are just a tool. The most significant aspect is the design and architecture. Give a group of developers the same problem to solve and you will get different solutions without fail. Some will be concise and elegant, others will be mind-blowingly complex.
Now throw in a database component and all bets are off. I have seen first-generation designs that are convoluted garbage. Depending on their experience, many developers might not even recognize it, or for that matter really care. Now imagine years of further development and you may very well end up with a monster.
Software development has matured considerably in the last decade or two. Most modern programmers are aware of data structures and design patterns. These are concepts that were not very well defined or understood back then. Programming is part science, part art.
I’m wondering if languages like COBOL and assembler are only easy to learn for people with the aptitude for them. I took CUNY classes in those languages about 35 years ago and they totally went over my head. No textbooks — there were textbooks, but the instructors hated them and would not use them; they relied on their lectures and the “hieroglyphics” on the blackboard. And no computers to type programs on for the most part, just punch cards into which you punched the letters, numbers, and commas. You took your program to a computer center where they ran the cards through a machine. If it worked, you got some data; if it didn’t, a blank sheet :/
Maybe it’s easier today with computers where you can actually see the molding of the program as well as on line tutorials and advanced teaching methods?
Yes, it’s a lot easier. Cards went away a long time ago.
Unfortunately you’re completely right with the problem being (IT) Management being short-sighted. And this is not only going on at banks. Other big companies are also doing the replace expensive specialists/contractors with cheap people thing. This will simply not work, but when this becomes plain to see, it will be hard to solve the resulting mess.
It’s so stupid, but it boils down to not recognizing that lots of businesses today are IT shops that happen to operate in a certain field, instead of the other way around. A bank is, at its core, totally dependent on IT, and lots of companies in other areas are as well. But they will not acknowledge this fact and act accordingly.
Yes, the analogy I always think of is a craft bakery, whose product is bespoke artisan bread, cakes and other non-standard custom foodstuffs. If they are in that business, over time they create their own unique recipes, adapt other ones to meet their specific needs and the needs of their customers, gain skills in how to manage this highly individual business consistently and profitably.
Then one day the owner recruits a new manager. He looks at the bakers, the kitchen staff, the ingredient suppliers, the finishers and says “There’s a guy who works down at Wal-Mart on minimum wage that puts frozen dough in an oven each morning and a few hours later stacks the shelves with 200 loaves of white sliced. What we’ll do is get rid of all these people (points at the existing bakery staff) and hire him instead. It’ll save us a fortune.”
Agreed. But here’s the thing. The solution to short-sighted management is to allow the firm to fail. Incompetence has to lead to bankruptcy, or else you are actively selecting for incompetence. If this is impossible because TBTF, TBTJ, etc., then that is, by definition, a management problem, not an IT problem.
The reason that Walmart and artisanal bakeries can both exist is because the customer is paying the cost, a cost that differs greatly* between the products. Some customers want fancy financial products. Others just want their paycheck to cover rent. Forcing renters to subsidize rentiers encourages short-sighted decision making processes by overriding the price discrimination in play that is allocating resources between those who want baked goods specific to their needs and those who want sliced enriched white bread-like product*.
Corporate management isn’t actually being short-sighted at all. Rather, they are externalizing their individual firms’ costs onto the public finances. It’s not a problem. It’s a solution. (From their point of view.) The problem isn’t legacy code at all. Rather, it’s management. The healthcare.gov rollout should have demonstrated that beyond any doubt. That’s a brand spankin’ new effort that was a very public failure that had nothing to do with technical competence of the American workforce or legacy code problems of the website.
*Of course, we could ask why inflation and inequality are so extreme that anybody would choose a bread like product over an actual baked good, but that’s getting really meta about both political economy and the meaning of the concept of choice.
The most significant issues with healthcare.gov were the interfaces between its brand spankin’ new code and the external legacy systems with which it needed to integrate. Which reflects an issue with the technical competence of the project workforce.
You appear to be new here. Posting inaccurate information is against our site policies. This is not a chat board, and commenting here is a privilege, not a right.
Lambert Strether posted extensively on the healthcare.gov technology train wreck, and he was the first to call that it would be a disaster. Why? The Administration was requiring major changes in functionality just months before launch — if I recall correctly, a mere three months. Anyone with an IT background will tell you that that sort of poor behavior by the client will make it impossible to meet a hard deadline with a workable product.
Look, I speak COBOL and there’s nothing mystical about it. It’s a garbage language in that it’s strongly typed (which I hate) and has limited output (meaning your print fields and console are also strongly typed, which I doubly hate). Other than that it’s no more complicated than FORTRAN or BASIC.
Now ‘C’ requires a paradigm shift in terms of integrating pointers (to, well, everything if you care to) and groupings of disparate data (structures) but COBOL doesn’t even allow those kind of constructs unless you get real Turing (any problem can be solved by a conditional jump) on it. Didn’t get that? I can go slower for you PASCAL people (yuck, all the drawbacks and none of the benefits).
The point is that it’s not a language failure, it’s a documentation failure. My genius bit of programming (other than using a hash table to eliminate duplicate data) was setting up an array (only data structure I had) that would handle multiple answers of multiple instances.
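The hash-table trick mentioned above maps directly onto a modern dict/set; a sketch with invented record fields, just to show the idea:

```python
def dedupe(records, key_fields):
    """Keep the first record seen for each key value; drop later
    duplicates. The hash table gives O(1) membership tests instead
    of sorting the file or doing nested scans over it."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

rows = [
    {"acct": "123", "amt": 10},
    {"acct": "456", "amt": 20},
    {"acct": "123", "amt": 10},  # duplicate, dropped
]
# dedupe(rows, ["acct", "amt"]) keeps only the first two rows
```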
Now my cousins had a tough job, because while I had a masterly 64K and a 10 MB Kaypro, they had to work their magic in 256 bytes (yup, seriously) in robots for GM.
The problem is that you have to leave instructions as to what the code is supposed to do otherwise even you won’t understand it in 6 months. When I got religion on structured coding my comments on each operation soared to 10 times the volume of my functional instructions (writing for humans is a cheat, sooo much easier than poetry for machines).
Some people think that expanding labels is a solution and it is to an extent but nothing matters as much as communicating your intent and noting the limitations of your subroutines.
And THIS is what makes working with a legacy base hard, not the language.
Yes and no. Most language implementations have quirks and weirdnesses stemming from bugs and ‘features’. It’s less common now, but was fairly common some time ago. Yes, it’s not a language problem per se, but it’s a platform problem, and sometimes it’s hard to separate the two (I’ve seen quite a bit of code that relied on knowledge of what exactly the compiler was doing). And often these things didn’t get documented, because “everyone knew them”. But few thought about whether anyone would still know them 30 years on.
There you go (test ? if_true : if_false).
As you can see I prefer ‘C’.
And I could junk it up with *trash.
/* *trash points to pointers().*/
And that’s why people hate ‘C’, but I love it. It’s only as obscure as it needs to be.
C is my favorite language… but C’s ability to directly access the machine, and C’s delegation of memory management to the programmer are directly responsible for many of the big security bugs you’ve heard about… and for many that highly skilled teams at places like Microsoft are searching for but still haven’t found in Windows and Office and other similarly widespread programs (e.g. OpenSSL). Google ‘buffer overflow’ for more. C won’t be our salvation, it may well be that crisis of which we speak.
As for inline assembler, it should be a feature of every language, it just makes your code unportable. I like the fact that ‘C’ allows for the abstraction of that which makes it easier to identify in the worst case and simply happens transparently in the best.
Memory management has never been a problem for me except in cases where the compiler implementation is flawed (unfortunately encompassing most Microsoft products).
Your comments reveal the role of human nature. Everyone wants to take shortcuts, but over time, quite painfully, one trips over pitfalls, and eventually bites the bullet and learns to do things the “hard” way (which turns out to be easier in the long run, 6 months later). In my limited experience, the point of structured languages is to save you from yourself, and from the “documentation failures” that arise from impatience. I’ve written a bunch of small to medium software projects in a scientific/academic environment (my favorite was Object Pascal: self-documenting interfaces, and it catches errors at compile time, too). Boy, do most students HATE to work this way — but only for the first 6 months.
Many of others’ objections criticized management’s behavior for taking analogous shortcuts in their own sphere (short-term planning, cheap-quick labor). DO-IT-FAST is a universal hazard. When time is money, this attitude translates to greed, but avarice is a tendency that encompasses more than money. For many reasons, good and ill, there’s a universal inclination to cut corners. It’s not entirely unhealthy. But institutions like regulators and structured languages are reviled precisely because they put barriers in place to moderate this tendency.
Critical entropy increase appears to be a common feature to both programming and management, and elsewhere. We’re discussing loss of employees who have knowledge of their company and how it works, alongside expansion of software complexity. In other words, loss of information and development of disorder. (Has anyone in the business world tried to extend Shannon’s work in this direction?) To pursue the metaphor, as time goes on, systems tend to convert energy to heat, and we’re facing the difficult problem of keeping our institutions in the space between meltdown and freeze. But who is minding the store? Oh yeah, that would be the free market. Profit motive will automatically fix that. Right.
Well, if you’ve ever implemented a keyboard remap or done any encoding, treating strings as numbers is sure handy, as is the ability (if you do screen animation) to swap video memory between display and working. Pascal and Modula 2 are no panacea, I can write bad brain bending code in any language.
Adding to my post at 6:12
There is a trivial reason why COBOL code became legacy.
Much COBOL code predates SQL and depends upon the clever layout of blocks (Indexed Sequential Access Method) on the physical storage medium (Direct Access Storage Device).
So changing a query, or adding a column to a database, required completely reorganizing the layout of the storage. These are things that are done more or less automatically now by SQL, though even then access to the data is blocked during the reorganization.
So, rather than alter the ISAM, COBOL routines are chained together, both sequentially and in parallel. This synchronization is incredibly fragile as it often depends upon undocumented side effects.
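To see why “adding a column” is so disruptive in this world: with fixed-layout records, every reader hard-codes byte offsets, so a change to the layout silently breaks programs you may not even know exist. A toy illustration, with invented offsets:

```python
# Hypothetical 30-byte account record with fields located by byte
# offset, the way a COBOL 01-level record layout with PIC clauses
# works. The layout is invented for illustration.
def read_balance_v1(record):
    # Layout v1: account number in bytes 0-9, balance in bytes 10-19.
    return int(record[10:20])

old = "0000123456" + "0000099950" + " " * 10
print(read_balance_v1(old))   # 99950

# Now "add a column": a 4-byte branch code inserted after the account
# number shifts every later field. The v1 reader doesn't crash -- it
# silently returns garbage, which is why every downstream program must
# be found and changed before the layout can be.
new = "0000123456" + "0042" + "0000099950" + " " * 6
print(read_balance_v1(new))   # 42000009, not 99950
```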
As someone who has done QA on these systems I can state that they are not deterministic. The control model for the system in development is the actual current system, not some formula where the output is the result of given inputs.
The systems could be reverse engineered via machine learning. But. Adaptive learning would require somehow instrumenting the current production systems. It would be very difficult to label and understand the knobs on a neural network back office.
The biggest impediment going forward is the power shift at the CIO suite. As the current system has no competitive challenger no manager will risk change, even if new knobs could be insanely profitable.
Yup. You would cheat for performance in all kinds of ways that are simply not necessary now.
Had a 9 track tape drive at one point.
When people talk about the sheer number of lines of COBOL code to manage/maintain/migrate, one element should be kept in mind: many of those programs have probably been generated from higher-level tools.
When I was developing MIS software in COBOL (decades ago), a majority of the programming actually consisted of embedding SQL statements (or your favourite data manipulation language, there were several), as well as high-level directives for (then forms and character-oriented) user interfaces, which were then translated into standard COBOL. This resulted in very long programs — lots and lots of COBOL statements invoking low-level interface, database, networking, etc, library functions. The actual COBOL programming took perhaps 20% of the total.
The problem is therefore not just the language; it is the entire tool suite (database management system, user interface toolkit, report generators, etc) that must be carried over.
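A toy illustration of that inflation — a pretend precompiler that expands one embedded EXEC SQL line into several generated COBOL statements (the expansion shown is invented, not any real precompiler’s output):

```python
def precompile(lines):
    """Expand each EXEC SQL line into low-level library calls, the
    way embedded-SQL precompilers generated standard COBOL. The
    generated statements below are made up for illustration."""
    out = []
    for line in lines:
        if line.strip().startswith("EXEC SQL"):
            out += [
                f"      * generated from: {line.strip()}",
                "           MOVE SQL-STMT-1 TO SQLDA-TEXT",
                "           CALL 'DBOPEN'  USING SQLCA SQLDA",
                "           CALL 'DBFETCH' USING SQLCA REC-BUF",
                "           IF SQLCODE NOT = 0 GO TO DB-ERROR",
            ]
        else:
            out.append(line)
    return out

source = [
    "           MOVE CUST-ID TO WS-KEY",
    "           EXEC SQL SELECT NAME INTO :WS-NAME END-EXEC",
]
# len(precompile(source)) == 6: one hand-written line survives,
# one embedded statement becomes five generated ones
```

Counting the output rather than the input gives a sense of how a codebase where “the actual COBOL programming took perhaps 20%” can still report millions of lines.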
What vlade states above is probably correct, and jibes with the “common wisdom” that above a certain percentage of required re-engineering and rewriting, it is more efficient to build a new system from scratch.
On the other hand, many small and some medium sized banks that had to deal with the dual issue of Y2K and Euro have already migrated to more modern systems — standard suites sold by the like of iflex, Oracle and Temenos (think SAP-like ERP for banks).
The biggest banking IT project disasters I’ve come across occur when a bank attempts a radical overhaul by replacing its entire legacy system estate with a single, new solution. Often “bank in a box” suites are selected, such as i-flex (Flexcube). Many Too Big To Fail banks have, at some point, attempted such a switchover. The capacity of such projects — which, while laudable, can run up massive cost and time over-runs and end in cancellation — to go spectacularly wrong is virtually infinite. Usually the bank and the software vendor are too embarrassed by the scale of the failure to ’fess up, and either simply write off the asset impairment and hide it in the accounts somewhere or settle quietly out of court. Occasionally details do surface publicly, such as here: http://www.rte.ie/news/business/2011/0131/297173-aib/
The stories which do make the press are the tip of a much larger iceberg.
Yep, I agree this doesn’t work. Which is why I advocate “new bank” (as in really a new, separate, legal entity) solutions. It still may not work, but is, I believe, a more feasible solution than in-place replacement.
I was part of a new greenfield development of a new core credit-card system back in the 1990s, and we established a model bank precisely for this purpose. Unfortunately, a merger scuttled the project.
In the world of three variables — cheap, fast, and good — the new bank invalidates at least two of the three.
Fast will not happen.
Cheap will not happen.
In an effort to control these two variables, good evaporates, resulting in the sixth of the six phases of project management: Punishment of the Innocent.
IT systems are not just IT. They always encode human business knowledge and the longer the old IT system has been running OK the more likely it is that the actual humans have just plain forgotten that knowledge is even in there somewhere. Which means the shiny replacement is often a laughable oversimplification that does not work so well. This can have business consequences.
Here is another lovely failure, in Sweden some years ago
….but unfortunately not all the details
CommBank declares core bank overhaul complete
It is ranked in the world top 50 banks (by assets)
It’s not clear what “overhaul” means.
The tell is in this line (emphasis mine):
So there’s an unstated number (both in terms of customers and product types) of non-standard accounts which can’t be hosted on the new system and are being maintained on the old one. Notice there’s no mention of decommissioning anything. My hunch is that complex products (market linked accounts, loans with novel features like payment holidays, credit cards, mortgages, that sort of thing) sit on at least one, or more likely several, “legacy” systems.
Recently one of the big regionals in the US decided to go with Oracle Financials and retire their current legacy system — it’ll be interesting to see how well that conversion goes.
When one has to be “clever” in organizing one’s data, one must understand it, and that understanding spills over into new features and new systems for the customer or users of the system. When one just asks the language to automatically add new elements or new queries, one does not need to understand the data structure on the device. This leads to inefficiencies, which have a compounding effect as the inefficient data structure is accessed by other applications as time goes by. Lucky for all of these inefficient languages, the hardware has masked their wasteful structure — but there is always the day of reckoning, the day that the structure cannot support new needs.
TBTF. Too Big To Fix.
That the eating of seed corn in the big financial services firms is running out of steam has become so obvious that the big consulting firms have moved on to the next thing – robots! Now, those without vision might question how leaders who can’t solve simple problems will find the wherewithal to masterfully roll out automation technology to address complex questions. Which misses the point which is to convince shareholders that you can do so, to keep the ball rolling until you can pass the problem on to your successors. I say that sincerely, not as sarcasm.
At a holiday barbecue I spoke to a professional couple in their 80s, long since retired, and touched on the challenges posed by this hollowing out of talent. It was the first they’d heard of it, and they were quite interested, then alarmed, and then one of them asked a surprisingly good question: what happens when people like you retire?
1. I’ve been hearing variants of that attrition question from the 90’s, which first came up in the context of early retirement programs. But it became especially painful when people with mid to senior level technical positions were laid off and found themselves unemployable. Suddenly large sectors of technical talent became useless, meaning effectively that entire sectors of intellectual and technical endeavor came to a halt.
2. Cost isn’t the only reason management doesn’t want to pay for expensive talent. With greater emphasis on generalist “business practice” instead of specialized hands-on experience, management has lost its ability and inclination to manage higher level “detail” projects competently. I’ve seen outsourcing used on projects that management might have developed in-house more economically, but which was sent out because said managers lacked the ability to manage it themselves.
It may be that this loss of in-house expertise just represents a phase of a changeover from old ways of doing things to new ones that are just as effective. Salman Rushdie once observed that “Morals are always declining,” but we’ve still managed to muddle along. Maybe the same principle applies here.
Probably not much different than other rats’ nests. Virtually no comments; the original programmers were learning on the job (GOTOs, jumps); while texting slang might seem new, it really isn’t (variable names under six characters); compiler issues; and editors twenty to forty years ago were not like today’s editors. Someone will have to bite the bullet for a capital improvement to clean up the mess. It’s just called work, with no easy button. And it repeats thirty to forty years from now.
Apart from the technical inaccuracies, especially those on ISAM and DASD, which will just become a pissing contest, the problem is financial, not technical.
If your favorite enterprise spends 2% of its personnel budget on code, and I assert ALL code becomes legacy code, then over 50 years they now have at least 100% of an annual budget invested in code.
Code lingers, un-depreciated, un-maintainable, just like an old building ready to be demolished.
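The arithmetic behind that claim, made explicit (the 2% figure and the assumption that no code is ever retired are the commenter’s, not measured numbers):

```python
# A steady fraction of the annual personnel budget goes into code, and all
# of it becomes legacy code that never goes away, so the stock of legacy
# code grows linearly with time.
annual_code_spend = 0.02   # 2% of personnel budget per year (assumed)
years = 50

accumulated_legacy = annual_code_spend * years
# After 50 years the firm carries legacy code representing 100% of one
# year's personnel budget.
```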
I wrote 360 assembler (and my programs are still in use), PL/1, COBOL, Basic, C++, Pascal, Delphi, and Java. Built a system whose install ran 7×24 for 8 years without a bug.
I’d fire a programmer who used these constructs: items, getmain, alloc, newing an object, and the if statement. The only way to write solid code is using Finite State Machines.
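For readers unfamiliar with the technique the commenter endorses, a table-driven finite state machine might be sketched like this; the states and events below are invented for illustration, not from any real banking system:

```python
# Control flow lives in one transition table rather than in scattered if
# statements; any (state, event) pair not in the table is a hard error.
TRANSITIONS = {
    ("RECEIVED",  "validate_ok"):   "VALIDATED",
    ("RECEIVED",  "validate_fail"): "REJECTED",
    ("VALIDATED", "post_ok"):       "POSTED",
    ("VALIDATED", "post_fail"):     "REJECTED",
}

def step(state: str, event: str) -> str:
    """Advance the machine by one event; reject anything not in the table."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"illegal event {event!r} in state {state!r}")

state = "RECEIVED"
state = step(state, "validate_ok")   # -> "VALIDATED"
state = step(state, "post_ok")       # -> "POSTED"
```

The appeal for long-lived systems is that every legal path is enumerated in one place, so an impossible sequence of events fails loudly instead of silently falling through a chain of conditionals.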
That could have been myself. This was the essence of the comment:
This article, quite frankly, is bullshit. I’ve made my career living and breathing COBOL and assembler in financial systems for over 30 years in some of the largest financial institutions in the US.
First off, software doesn’t rot. Second, COBOL is dead simple to read and understand. The REAL problems are missing source (which should never happen and should be a fireable offense if it surfaces) and poorly-documented interfaces; and companies trying to get by on the cheap by not training replacements for their retiring staffs and instead offshoring/outsourcing in order to make their numbers ‘look good’. That can only lead to disasters like what happened to St. George’s in Australia a few years back, when their core banking system was offline for days.
Right. (/s) Parts of ALIS (Advanced Life Information System, an IBM product and documented well) were almost impossible to understand a few years after release.
We spent hours poring over code to understand when it had broken.
Maintaining code is hard, because you have to get into the programmer’s head, or intent. In addition, management does not pay programmers to maintain documentation.
and then you state the problem:
You can do it, I can do it, and no one wants to pay us to do it. Despite my history, I’d not get hired today – too old.
And if the companies don’t want to pay to do the job? Well, that’s a business decision they decided to make, and they get to live with it.
It’s unfortunate that the entire history of modern computing runs in parallel to the history of the neo-liberal consensus with its tacit insistence that the price of all sorts of neglect can be booked as ‘profits’.
The folks in the ‘C’ suites have been neglecting not only R&D, but basic maintenance, so as to pad the bottom line and thus make next quarter’s bonus check bigger, with predictable results.
Add to this a dyed-in-the-wool hatred of labor (yes, IT people are laborers) and you have a recipe for our current predicament.
Our planet’s IT infrastructure is in the same situation as our transportation infrastructure.
I think my favorite is the second para
Golang was that skunk works MicroFace project where they merged the very worst aspects of Go and Erlang. I’d say more, but I’m under a SoftBook NDA.
If you don’t have enough money to test the code, then you don’t have enough to replace it either.
Legacy system is another name for
“Systems that work”
Agree completely. I’m sure some of the clearing houses are still using OS/2 on some stuff. SWIFT?
If it’s that old, you know the probable attack vectors. You also have an idea of how and when it might break, and therefore have a good idea on how you’ll have to fix it.
Reversibility is key with banks. It just has to keep chugging along, recording….it can always come back later, at the end of the day, or week, or month…
Not having the ability to record kills bank IT. Everything else can be summed up later, by whatever other means.
If it ain’t broke don’t fix it.
Keep it simple, stupid.
When NASA sends space probes on multi-year missions, they use the simplest and most primitive computers they can get away with.
So our banks are using old, simple, proven code, that uses a language that is almost unknown to your average Elbonian hacker, and that can be readily understood and validated by mortal human beings. I fail to see the problem here.
Instead we should shift to massive complex hard-to-understand multi-gigabyte code bases that are constantly mutating as software companies practice planned obsolescence, that provide multiple entry points for hackers, etc? And think of all the exciting new ‘financial products’ the banks could come up with! How about: recursive fractal non-causal mortgage loans?
Remember how toxic ‘financial innovation’ has turned out to be? How about we just keep at least the core of banking simple?
And in the original “Battlestar Galactica” TV series, didn’t it turn out that COBOL was the source of all life?
By definition, the second a system hits production it is instantly ‘Legacy’.
Who knew Naked Capitalism was where all the retired programmers/systems analysts dwelt? 62+ comments on such an obscure topic ? It must be like sexy time for Seniors :)
‘Who knew Naked Capitalism was where all the retired programmers/systems analysts dwelt?’
It’s a good thing. I’m learning stuff I didn’t know about from people who possess the relevant, not-widely-distributed competencies.
From Clive’s spot-on statement:
“At the C-suite level, the drive for bodyshopification continues unrelentingly — take experienced, knowledgeable but expensive subject matter experts, shovel them out and get in someone (maybe two or even three different people, it doesn’t seem to matter) at 1/3rd the cost in India, the Philippines, Poland, it really doesn’t bother anyone at all where, instead. All they need to do is have a months’ handover and get a link to a SharePoint site with the “documentation” and off they go. ”
That’s the higher level problem. I suggest that if banks were still run by old school bankers who actually understood what their bank does and how it does it, who understood the comprehensive scope of processes and transactions start to finish (which requires long apprenticeship and experience), then the banking system could be analyzed properly before any new coding started. While that analysis would be very time consuming, it would stand a chance of success. However, TBTF banks are now run by people who know the theory and not the details. They have only a partial and walled understanding of the whole process (see, e.g., the failure to train underwriters). Systems analysts aren’t bankers but must rely on the bankers to say what needs to be done. But the bankers at TBTF banks no longer understand banking. They rely on the computer to do what needs to be done to handle the processes. The bankers have lost the skill of understanding what the processes really are or how they relate to the whole. Rather than being able to guide programmers about the whole, the bankers leave it to the coders to figure out what the old code did and why. The blind leading the blind. My 2 cents.
Not just old school bankers but old school banking, where the bankers were partners with their own fortunes on the line if anything failed. Probably the only form of “market” discipline that will ever work in the financial sector. I can’t really imagine a partner risking personal bankruptcy by putting their bank’s IT systems in the hands of a bunch of $2/hr coders from India/Philippines/Poland/Uzbekistan …
See the old poem “The Blind Men and the Elephant.”
We don’t have apprenticeships we have education, which while good to an extent often degenerates into endless alienated paper chasing because that’s “what you need to be hired”. The employer doesn’t have to pay for any training and the degree may be mostly signaling.
We don’t have dedicated expertise, we have deskilled as much as possible and got for as cheap as possible labor putting in long mindless hours because they are cheaper that way. We run very fast to get nowhere.
I was thinking just that on more than one occasion today, as various keen, eager-to-be-helpful, trying-to-make-progress developers, analysts and specialists struggled manfully (and womanfully) to solve problems they hadn’t, in a lot of cases, the foggiest idea where to start with. A few others with a little more experience (maybe they’d been on the account for about a year) then made valiant but largely futile attempts to fill in the not-trivial number of blanks.
For me, I knew (and not for the first time, nor likely the last) what led Louis B. Mayer to exclaim “nobody knows anything”.
If I didn’t know better I’d swear you must work for the same TBTF!
It’s called technical debt, and COBOL mainframe is just one small facet of it. IMHO, it is not an FSI driven problem. They are just most vulnerable to it given how mission critical their systems are. This is a “feature” of the Tech Industry rentier business model.
Name a technology and it has the same issues. SAN storage, flash vs DASD, old versions of windows servers, good luck patching java on every server with the latest release from Oracle every other day. The whole ATM world runs on legacy Tandem technology now owned by HP. Message brokering, middleware, DBMS, core and edge networks, firewalls, MDM, ETL, I can go on and on all the way up the stack to the actual business apps. All of it becomes obsolete. All of it is intertwined. And all of it is designed to lock you into being unable to replace it.
That is why IBM has 90% software margins. If it was easy to replace, it would be 35%.
Greenfield is being tried by a number of FSI’s. It’s still not cheap, requires significant upfront dev investment and if you are a multi-national you will need multi-region data center capabilities or make a monster bet on Cloud. Good luck getting multi-jurisdictional regulatory approval to run systemically important production code with customer data at Amazon or Google. (BTW, the Fintech startups live in the delusional world that this will just magically happen one day. Servers, storage, switches? Just fire up a container on AWS!)
I’ve seen my fair share of C-level execs that want to ignore this problem. But I’ve also worked with some great ones who are in the middle of dealing with it everyday. The business cases, just for mainframe conversions, typically look like a five year project that costs 10x the annual run rate. With chances of success commensurate with the post. It introduces far more risk than fortifying systems with proper upgrades and design.
And yes, the problems are more pronounced in FSI, but this is a global tech industry issue. Tomorrow’s brownfield starts with today’s greenfield.
The problem is not so much the language.
It is that nobody is trained in writing it any longer, because programming has a problem with always chasing the latest “shiny”.
On top of that, the code was written back when RAM was expensive. To the point that removing the first two digits of the year (after all, it only changes once every 100 years, and by that time something newer and shinier would surely have taken over) allowed the processing to be done on a cheaper computer.
All in all, programming and corporate economics do not mix. This because corporate economics expect a purchase to be run to scrap before being replaced, while programming expect everything to be replaced every decade or less.
Well, is there any incentive on the part of the person doing the work NOT to chase the shiny? If one were to set out (some might say sell out, but be that as it may) to become a COBOL programmer, I imagine it to be somewhat boring work, but in exchange for that, is it even guaranteed to pay the bills, or will one be passed over for promotions, higher pay, or even jobs themselves for someone with “shiny” on their resume? So the shiny get all the raises etc., because after all they are so shiny ….
Sure, you forego the ego boost of bragging about your mad skills by being a COBOL programmer, but does it even pay off in the much more mundane terms of a steady paycheck? Will you be aged out (can’t be doing something cutting edge like COBOL when you’re 50, after all! We’re only looking for freshly minted 21-year-olds who dream of working in COBOL …) or outsourced?
“All in all, programming and corporate economics do not mix. This because corporate economics expect a purchase to be run to scrap before being replaced, while programming expect everything to be replaced every decade or less.”
+1. And cost cutting: corporate economics wants anything new to be done on the cheap. The old systems may work better simply because firms were more willing to invest in them initially.
No one can blame workers for looking at every job only as a stepping stone to the next job; what choice do they have when the system we live in is musical chairs forever, with more seats taken away with every recession and every trade deal? No job can even be worked on its own merits or entirely for its own sake, really; only a less precarious means of survival would allow that.
We installed the early IBM 3600 Banking System in a bank in South Africa in the 1970s. I then left South Africa, because there was a small black cloud on the horizon.
When I returned on my mother’s death, in 1997, I went into a branch of the same bank, and there, sitting in the branch, in use, was the IBM 3600 banking system, now over 20 years old.
Who could have known, in 1975, that the programming and technology from the ’70s would still be in use decades later, and not replaced by PCs and Microsoft products?
I do wonder from where the Bank got spare parts.
I remember once the TBTF had to google around looking for spare IBM 3700 parts — disk controllers, token ring adapters, that sort of thing. It was like an episode of The Clangers. All it lacked was an appearance by the Soup Dragon.*
(* non-UK readers, please don’t worry, you have to have been a child in 1970s Britain to understand this reference, apologies…)
The Clangers had a very practical approach to social and technological ecology and they seemed pretty happy with it. I think they developed after the Wombles had cobbled together spaceflight technology from Wimbledon’s discarded hoovers and twin tubs.
Heh, I think I was exposed to a translated variant of the Clangers during my early years.
Way too early for me to get any references though, and quite possibly lost in translation anyway.
Actually you do raise an interesting question: what lifecycle should software be designed for? The whole Y2K problem came about because in the 1970s no one thought that software would last 20 or more years (it had not up till then, through at least one generation change, from the predecessors of the System/360 to the System/360, albeit the 360s could emulate the 1401 and 7094 systems). Or recall that due to the 32-bit size of time variables, on January 19, 2038 time on Unix machines will roll over. This has been fixed on newer systems, where it will take until 4 December 292,277,026,596 CE to roll over (64-bit time variable). https://en.wikipedia.org/wiki/Unix_time.
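The 2038 rollover falls straight out of arithmetic on a signed 32-bit counter of seconds since the Unix epoch:

```python
# A signed 32-bit time_t counts seconds from 1970-01-01T00:00:00 UTC and
# tops out at 2**31 - 1 seconds, after which it wraps negative and the
# machine thinks it is back in December 1901.
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
last_32bit_second = EPOCH + timedelta(seconds=2**31 - 1)
# -> 2038-01-19 03:14:07 UTC
```

The 64-bit fix works the same way; the counter is simply wide enough that the wrap date lands hundreds of billions of years out.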
As noted in earlier posts, the tradeoffs in programming have changed over time. Originally machine time and memory were the precious components, and human time was cheap. This has changed, in that memory and machine time are now far cheaper than human time. Thus different tradeoffs would be made.
CODE is always easy to change. DATA, not so much.
Not so (changes are more often to data than to code). And the desire to do *anything and everything* to avoid having to modify legacy code can make the complexity problem worse, so this sort of thinking needs to be applied with caution.
To give an example, when the TBTF’s payment system had to be changed to implement International Bank Account Number (IBAN) input (a regulatory requirement which meant that customers needed, if they insisted, to quote only the IBAN of the account they wished to pay), there was a huge architectural discussion about whether the COBOL-based legacy code should be updated to facilitate this or whether we should use another system to host a new IBAN lookup table which could then convert the IBAN into the data which the payments system could already accept.
Some technicians made just your argument: this was “only” a new data type and there was no need to touch the code. It was “easier” to do the data transformation elsewhere. Other architects said no: IBANs were, while just a new data type, now critical to processing payments, and the core payments system’s API had to be able to handle transactions using native IBANs. Implementing a separate look-up in another application would add to the spaghetti systems problem.
Eventually the decision was made (rightly I think) to bite the bullet and update the legacy code. But in no way are such decisions clear-cut. Often, lowest first cost wins out, regardless of the long-term maintainability. There are very, very few CIOs who can parse this kind of technicality to those others in an organisation whose understanding doesn’t extend beyond crude cost calculations which are never adjusted for risk. The balance of power is hopelessly skewed.
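For the curious, the IBAN check the architects were arguing about hosting is itself a small, well-specified algorithm (ISO 13616’s mod-97 test); a minimal sketch:

```python
# ISO 13616 validates an IBAN by moving the first four characters to the
# end, replacing each letter with its base-36 value (A=10 ... Z=35), and
# checking that the resulting integer leaves remainder 1 modulo 97.
def iban_is_valid(iban: str) -> bool:
    s = iban.replace(" ", "").upper()
    rearranged = s[4:] + s[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1

iban_is_valid("GB82 WEST 1234 5698 7654 32")  # the standard example IBAN
```

Note this checks only the check digits, not country-specific lengths or whether the account exists; the architectural fight was over where in the system such validation and conversion should live, not over the arithmetic.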
Those people in 4 December 292,277,026,596 CE are truly screwed. In the year 292,277,026,595 CE, companies will have to budget 2% × (292,277,026,594 − 1,960), or about 5,000,000,000 years of budget money.
Complaints will abound about the use of web sites from the year 10,000 CE.
The problem grows at about 2% every year.
Likely ol’ IBM was happily supplying them, at ever inflated rates, the whole time.
Way not my industry, but I read this article lamenting that COBOL is “old” and the programmer pool is aging.
“… remember using it fresh out of school in the 1970s and 1980s. It’s the IT equivalent of hieroglyphics..”
So how about English language, fairly old but still enjoys some popular use. Should we be replacing it?
If there is nothing inherently broken with COBOL, and it has a fantastically successful MTBF run no less, why not just train more COBOL-literate programmers and retrofit it to new hardware?
Cobol – do banks speak our language?
A key problem with Cobol is that even though it is the third most widely used language in financial services applications (research by CAST Software), only a dwindling pool of developers is Cobol-proficient, particularly those with knowledge of the context in which the systems were implemented. They have become a relatively endangered species. It has been said that IT workers with Cobol in their kit bags have the safest, and some of the best paid, jobs in the City, and financial institutions are terrified of facing up to the day when these people might retire.
Computerworld conducted a survey back in 2012, and of the 357 IT professionals who were questioned, 46% said they were noticing a Cobol programming shortage and 50% said the average age of their Cobol people is 45+.
First error: getting your info from Computerworld. They have always been more of a vendor mouthpiece than anything else.
LYIT, a small institute of technology in northwest Ireland, has a postgrad program (Higher Diploma in Arts in Financial Services Technologies) which focuses on COBOL. They have a 100% graduate employment rate.
I do IT consulting, Project/Program Management, for medium and large clients including some of the “Systemically Important” banks. I continue to be amazed how poor these organizations are at managing and executing projects of any size. It seems to be a combination of inexperience at all levels, partially due to the “functional decomposition” of the organization, and partially due to the ignorance and unwillingness of senior management to entertain the real level of effort and expense of doing IT projects properly. They don’t want to spend the initial money but actually spend far more dealing with bugs and their fallout. It seems nothing is ever learned. It doesn’t need to be that difficult. The language, COBOL, is irrelevant to the problem. It is probably one of the better ones for these applications. (Although I am not a fan.)
The article and postings seem to me to be a fair statement of the world of IT. But what is being missed here is a deeper-level problem: the problem of poor or non-existent analogies, and poor, mostly non-existent grouping schemes.
What the IT profession is missing is a systematic grouping and classifying of their work – software code. At present it is as if doctors could only talk about the human body as a very large group of cells. Accurate, but not helpful to pretty much anyone, doctor or patient. This level of detail is much too low. It misses how cells can be grouped into organs: cells that share similar functions, inputs, and outputs, that can be seen to ‘belong together’, to be bound together in a definable fashion; examples are the heart, liver, and brain. Then organs are grouped into systems of function, a group of organs that together perform a coherent function; examples are the nervous system and the circulatory (blood) system. Without this systematic grouping by stated classification schemes, you are left thinking the forest is simply a lot of homogeneous trees. (It’s not, in case you were wondering…)
Now, when a coder says they are replacing some lines of code or altering a routine or adding some function to the software, they really can not state at a higher level what they are doing. It would be like the doctor telling you he wants to replace some cells in your body. What will happen to you is not very clear. What is required of you, your time, and your wallet. And what cells in my body? But when the doctor talks about the body in mid and high levels of classification schemes and analogies, he would say you need a heart transplant. Now you have a clear idea of what is required by both the doctor and you.
As a former coder and software tester, I know something about the great weaknesses of the IT industry. I will only point out here what I think is the true problem with IT and ‘obsolete code’. And I am not optimistic that the industry, companies, and individual professionals will try to get out of the nitnoid details of code and start the effort to classify their work on criteria analogous to the human body’s functions, systems, and interactions.
Just an anecdote, from many years ago (I’m a complete dunce on this subject), but it seems relevant: It’s from a now-old book called “Fleecing the Lambs,” about Wall St., that Yves is probably familiar with. Some of this stuff has been going on for a long time. The anecdote is from the introduction of hi-tech equipment in the brokerages.
One brokerage, probably not named, was convinced to ditch all that expensive paper and go strictly digital. The system crashed, of course, and that brokerage ceased, expensively, to exist. And yes, I saw the point that these legacy systems are remarkably stable – that’s why they’re still there. What about the larger systems they’re a mysterious part of?
There’ve also been a number of failed attempts to replace government computer systems. I think the root cause is the interlock between the computer systems and the equally-complex human systems (bureaucracies are a good deal like computers) that attempt to use them. Programmers and IT techs getting old and going away are just an example.
My brother was in the business of training people to use new computer systems – for the military, yet another example. I think I’ll try picking his brain when we get together next month, see if he has any really scary stories. He worked on NORAD.
Love yah, don’t ever change. There are reasons behind such sayings. Info transfer back and forth from old to new in steps, blocks; I thought that’s what packets were about?
Or could be.
People need to pay very close attention to what their engineers tell them. An Engineer & a Mechanic do not want their monkey wrench found in a crash, with their name on it.
They will go someplace else, at least in the nation which laws apply to them.
Managers & Directors have it fixed they can perform acts of Gross Negligence and drift away with minimums of 20 million somehow.
There are Systems Engineers who would tell one, or have as the case is set, what to do to make it all accounted for.
I am of the judgement that Banks with shareholders and investments now lust for Crisis. Every crisis is their fault and we are supposed to give them our money then to fix their crisis since we were made the reinsurer of their dreams.
It’s like I had work after people died in a test flight. Or we had work because the weather tore the pier apart. (That was a fun job; I got to run a small crane.)
All the bankers & gamblers are onto a sure thing with crisis downs and then the OK we can drive up monopolies.
It’s the rich wot gets the pleasure
It’s the poor wot gets the blame
Excerpt from The Red Flag.