Archive for the 'News and Analysis' Category

So what is it with Rogue Wave and XML & SOA?

Thursday, July 17th, 2008

Where do we start to describe what Rogue Wave has to offer in the world of Services, Web Services, SOA or even SCA?

Firstly, it’s important to understand that Rogue Wave is a leading provider of C++ Technologies: “we make C++ look like Java”, thus helping to save the cost of implementing common programming tasks…

Secondly, it is useful to review a bit of history: When web technologies started to emerge, people turned to Java and Tomcat Servlets to create more advanced Web Interfaces. In some ways, this was the birth of J2EE & Application Servers, and one of the biggest challenges was, and still is, to integrate existing C++ server business logic with this new Java front end. People turned to JNI for in process communication or CORBA for out of process interaction.

Because those solutions can be seen as overkill, Rogue Wave offered an alternative strategy by providing a native C++ implementation of Servlets, named Bobcat that lets you directly connect your C++ to the web protocol. It may not be the best tool to create a full interactive Web GUI, but if you only have to create an HTML table from a “C++ derivative pricing computation”, spending days developing a JNI interface might not be your best way… And with “AJAX-like asynchronous web requests”, distributing them across heterogeneous nodes makes even more sense nowadays.

A few years later, during the proving grounds of XML, people were also struggling to find a good solution to “parse XML in C++”. Rogue Wave introduced XML Object Link, still the simplest C++ XML Binding solution for XML parsing. Put simply, give it a schema file (the description of what is expected in the XML data), and it will automatically generate all you need to parse, modify, create and save that data. There is no need to understand and explicitly program the parsing logic like for DOM or SAX. So if you are tasked to create a programming API to deal with a complex schema like FIXML, you ought to have a look at it.

Lastly, let’s look at the current offering:

You can’t purchase Bobcat or XML Object Link any more, but the same functionality is now available in HydraExpress, formerly known as LEIF. At its core, though, HydraExpress is a tool for creating and consuming web services from C++.

You start from a WSDL, and you can create a “C++ Proxy” (a class that lets you call the related service) or a “C++ Server Skeleton” (a class you can derive from to implement the service). HydraExpress also provides you with a runtime environment to deploy the newly created services. So if you have a WSDL, you can be up and running with a C++ implementation in minutes: If you have to create a new IFX interface, a new Pricing engine…etc… with logic in C++, HydraExpress is the tool for you.
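To make the proxy/skeleton split concrete, here is a minimal sketch in plain C++. It is not code generated by HydraExpress: every name in it (PricingServiceSkeleton, FixedPricingService, PricingServiceProxy, getPrice, the Transport callback) is hypothetical, and the transport is an in-process callback standing in for the SOAP-over-HTTP plumbing a real toolkit would supply.

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical service contract, standing in for what a WSDL would describe:
// a single operation that returns a price for an instrument.
class PricingServiceSkeleton {
public:
    virtual ~PricingServiceSkeleton() = default;
    virtual double getPrice(const std::string& instrumentId) = 0;
};

// Server side: derive from the skeleton and fill in the business logic.
class FixedPricingService : public PricingServiceSkeleton {
public:
    double getPrice(const std::string& instrumentId) override {
        // Placeholder logic; a real service would call the pricing engine here.
        return instrumentId.empty() ? 0.0 : 101.25;
    }
};

// Client side: the proxy exposes the same operation and forwards the call
// through a transport. Here the transport is just a callback (an assumption
// made for this sketch); a real proxy would serialize the request and send it
// over the network.
class PricingServiceProxy {
public:
    using Transport = std::function<double(const std::string&)>;

    explicit PricingServiceProxy(Transport transport)
        : transport_(std::move(transport)) {}

    double getPrice(const std::string& instrumentId) {
        return transport_(instrumentId);
    }

private:
    Transport transport_;
};
```

A caller would construct a FixedPricingService, hand the proxy a callback that forwards to it, and then call getPrice on the proxy as if the service were local. The point of the pattern is that neither side has to know how the call travels.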

After all, the Web Services promise is to make heterogeneous systems work together. So, contrary to popular belief, it doesn’t need to be all Java.

Creating independent reusable services is of great value, pretty easy to do (thanks to HydraExpress and equivalent Java technologies), but “connecting the dots” between them and making them interact has turned out to be trickier than expected.

There are a lot of reasons why building an application by “drawing arrows” in a BPM graphical tool is doomed to fail. One must be careful about Central Points of Failure, Bottlenecks, Queues, Dynamic Load Balancing, and Failover… Without these, one will fall into the same traps that led to the Myth “SOA is slow”.

For this, Rogue Wave uses HydraSCA.

HydraSCA lets you build C++, Java or Composite Services based on SCA. HydraSCA is used to implement customer solutions with a focus on performance from the get-go. Dimensioning and load balancing services are key elements for achieving a well-performing Service Oriented Solution, and HydraSCA provides those capabilities through an advanced distribution mechanism, an optimized runtime, and a pipeline theory.

Lastly, HydraSCA relies on a standard data access technology called SDO that has triggered off-the-shelf independent products. One of them is HydraSDO for XML, which lets you parse XML files using a dynamic interface (unlike HydraExpress, which is static). In tests, HydraSDO for XML has been able to offer DOM-like random access to data at a speed and memory consumption close to SAX.

Is anyone using the WS-* standards?

Wednesday, July 9th, 2008

Back when SOA was the new, big buzz word there was a lot of talk around the WS-* standards for Web services.  Our company and many other vendors providing products in the space were doing a lot of talking about which standards were relevant, when would they be needed, and how they would work within an enterprise.

It’s been several years now since most of them were proposed, and in my experience talking with our customers, there is no widespread adoption or even need.  The one exception to that statement would be WS-Security.  There is always a need for security, and at a basic level encryption, when doing any type of messaging.  When speaking with enterprise architects and developers, security always enters the conversation at some point.  However, one standard that had a lot of talk in the early days but hasn’t turned out to show real demand is WS-Reliable Messaging.  We have found that most people would rather rely on their enterprise transport mechanism, such as a queuing or JMS implementation.  Is there real need out there for this standard and we’re simply not seeing it?

There are also standards like WS-Addressing (needed for WS-Reliable Messaging) and WS-Transaction.  Neither of these has come up in a conversation that I have had for at least a year.  Are these viable, necessary standards that vendors should strive to attain compliance to?  Are there other standards that are absolutely required to implement Web services within an enterprise environment?  I guess until the masses rise up and declare the need, they might just show up as another check-box item when evaluating vendor products.

HydraSDO for XML with FpML Example

Thursday, July 3rd, 2008

FpML (Financial products Markup Language) is the industry standard protocol for complex financial products. It was first published in 1999 and is now managed by the ISDA (International Swaps and Derivatives Association). FpML is important because it is the XML specification for OTC (over-the-counter) derivatives and its use has increased substantially over the past few years, especially for interest rate and credit derivatives, according to a recent survey by the ISDA. The complete FpML specifications can be found on the FpML website.

Rogue Wave Software’s HydraSDO for XML product enables XML documents to be read and updated using the SDO (Service Data Objects) API which uses simple XPath notation. SDO is the industry standard for data access in a Service Oriented Architecture. HydraSDO for XML has extremely fast parsing capabilities and very low memory requirements, resulting in performance improvements for most applications. Both C++ and Java are supported, with shared memory access allowing a single copy of data to be accessed by both a C++ and a Java application.

The example below shows how a bond option FpML instance document can be read into memory, and data can be easily retrieved and modified using the HydraSDO for XML C++ API. The instance document (bond-option.xml) is available from the FpML website.

Example

#include <iostream>
#include <string>

#include <rwsf/sdo/DataFactory.h>
#include <rwsf/sdo/DataGraph.h>
#include <rwsf/sdo/DataObject.h>
#include <rwsf/sdo/DataObjectList.h>
#include <rwsf/sdo/HelperProvider.h>
#include <rwsf/sdo/PropertyList.h>
#include <rwsf/sdo/XMLDataAccessService.h>
#include <rwsf/sdo/XSDHelper.h>

using namespace rwsf;
using namespace sdo;
using namespace std;

const char* SCHEMA_FILE = "../fpml-bond-option-4-4.xsd";
const char* INSTANCE_DOC = "../bond-option.xml";

int main()
{
  try
  {
    // Create an XSD Helper instance for working with XML schema
    DataFactoryPtr dataFactory = DataFactory::getDataFactory();
    XSDHelperPtr xsdHelper = HelperProvider::getXSDHelper(dataFactory);

    // Define Types and Properties from the XML Schema
    xsdHelper->defineFile(SCHEMA_FILE);

    // Load the XML instance document
    XMLDataAccessService das(dataFactory);
    DataGraphPtr dataGraph = das.loadFile(INSTANCE_DOC);

    // Get the root data object, then the FpML document element
    DataObjectPtr root = dataGraph->getRootObject();
    DataObjectPtr fpmlDoc = root->getDataObject("FpML");

    // Get bond details using XPath-style property paths, one value per line
    DataObjectPtr bond = fpmlDoc->getDataObject("trade[1]/bondOption/bond");
    std::cout << "Instrument Id = " << bond->getCString("instrumentId[1]") << std::endl;
    std::cout << "Currency = " << bond->getCString("currency") << std::endl;
    std::cout << "Coupon Rate = " << bond->getCString("couponRate") << std::endl;
    std::cout << "Maturity = " << bond->getCString("maturity") << std::endl;
    std::cout << "Par Value = " << bond->getCString("parValue") << std::endl;
    std::cout << "Face Amount = " << bond->getCString("faceAmount") << std::endl;

    // Modify the Coupon Rate from 0.014 to 0.015
    bond->setCString("couponRate", "0.015");

    // Save the changes to a new XML instance document
    das.save(dataGraph, "bond-option.001.xml");

    return 0;
  }
  catch (SDORuntimeException& e)
  {
    // Report the failure and signal it through the exit status
    cerr << e.why() << endl;
    return 1;
  }
}

Intel’s ‘Ct’

Wednesday, June 25th, 2008

Intel recently announced that they are working on a new programming language specifically designed for multi-core CPU hardware - called ‘Ct’. Ct is ‘C’ for throughput, and is essentially the C programming language with extensions.

It is similar in many ways to CUDA from nVidia and Brook+ from AMD, although Ct is for CPUs and CUDA & Brook+ are for GPUs (see earlier post re: GPUs). This is likely to be a good thing for software developers who are working on getting existing and yet-to-be-written software to scale appropriately on multi-core hardware.

Ct uses the combination of a compiler and runtime to take much of the burden of parallelism from the software developer. For example, the basic tasking unit is a ‘future’, which can be executed now or later and receives data consistency guarantees from the runtime. You can find details on how it will work on Intel’s site.

It does, however, highlight again the split that has occurred in hardware design - all vendors are going multi-core/multi-thread, but some are taking more of a homogeneous CPU approach, and some are taking a more heterogeneous GPU (accelerator) approach.

For software engineers, this means productivity challenges (“how do I get my existing code to run on GPUs, how do I get it to scale on multi-core CPUs”) as well as portability issues (“I don’t really want to maintain code written in CUDA, Brook+ and Ct, even though they are all variants of C”). This is all related to the Multi-core Dilemma that I have written about previously on the Intel Blog site and elsewhere.

Rogue Wave’s ‘Hydra’ product uses Service Parallelism to address the Multi-core Dilemma on CPUs, and we have worked with Intel a great deal on this, as it is complementary to Ct and other Intel technologies like TBB.

We are also working with both nVidia and AMD on Project “Gazelle” to address GPUs. “Gazelle” will generate optimized code for nVidia and AMD GPUs, and could do the same for Intel Ct in the future to ease migration for existing applications.

Pac-Man crashed the SIFMA 2008 party. His message: “I will help you save on your electricity bill”

Thursday, June 19th, 2008

June 10th, New York City: At about 100 degrees, this was probably one of the hottest spring days in 2008. The SIFMA (Securities Industry and Financial Markets Association) technology management exhibit was just opening up, and to keep all the suit-wearing businessmen cool, the hotel’s air conditioners were throwing many BTUs away…

This reminded me of the reason why I was there, standing at the AMD booth, demonstrating computation running on an AMD FireStream graphics card…

It all started almost a year ago, when most of our Wall Street customers asked us whether we could help with programming for the GPU (Graphical Processing Unit), more widely known as an ‘accelerated graphics card’.

Their interest is pretty simple - financial computations require a LOT of computing power. And with a traditional CPU-based approach like a grid, a LOT of computing power requires a LOT of electrical power, which at the end of the day is lost in the air conditioning system…

It is foreseen that ‘accelerated computing’ based on hardware derived from what is commonly known as ‘graphics cards’ is the best chance to save a lot of those BTUs…

Why?

First, GPUs can accelerate computations by a huge ratio.

Pac-Man was released in 1980, and by today’s standards, moving the yellow flat circle across the screen is no longer considered a technological achievement. As a matter of fact, 3dfx Inc. revolutionized gaming in 1995 by introducing the first ‘consumer accelerated graphics card’, hence delivering ‘mind bending graphics’. Simultaneously, Microsoft introduced the first version of their DirectX API, now the leading reference in gaming development. And 13 years later, GPUs are now able to create three dimensional images more than 200 times faster than regular CPUs.

It is not a stretch to establish a parallel between the Black-Scholes paper (published in 1973 and introducing basic option pricing concepts) and Pac-Man. After all, they both created a new industry. But while the ‘accelerated computing’ revolution happened in 1995 for video-gaming, Financial and General Purpose accelerated computing is being revolutionized today.

AMD and nVIDIA are the first to introduce new dedicated cards that are no longer limited to graphics and linear algebra, but can also run full double-precision C-like programs on extremely large sets of data. Simultaneously, general-purpose APIs are becoming available. And preliminary tests show a 10X - 40X improvement for some applications. It is still a bit shy of the acceleration we are seeing in the graphics world, but remember we are talking about 1st generation hardware.

Second, GPUs can save power.

Even at a 20X improvement, a single GPU offers the performance of 20 CPU cores. And it consumes roughly 150 watts or less… And if you can picture it, it is easy to compare the size of the 8-core server I was using at the SIFMA show with that of the AMD FireStream card (about a fourth of a shoe box).

SIFMA 2008 was the perfect opportunity to confirm that ‘accelerated computing’ is the future. The overall feeling, however, is that GPU development is still slowed in most cases by an industry that is maturing in both APIs and hardware, even though people are seriously investigating it.

And hedging the investment made in a single one of those new technologies is still the main concern, people being a bit reluctant to put all their eggs in the same basket.

As a matter of fact, lessons can be learned from history: When developing video games in 1995, the API of choice was Glide, actually published by the leading and only vendor in the accelerated 3D card market: 3dfx Inc. But “By 2000, the improved performance of Direct3D and OpenGL on the average personal computer, coupled with the huge variety of new 3D cards on the market, the widespread support of these standard APIs by the game developer community and the closure of 3dfx, would make Glide obsolete.” (source: Wikipedia)

I almost forgot! But that was really my reason to be there.

I have been helping with many parallel-computing assessments for Wall Street and non-Wall Street customers, evaluating current implementations of CPU-intensive and non-CPU-intensive applications to see how GPUs and other techniques like multi-threading and service grids can help improve throughput and reduce latency. And with the support of our development team, we can provide solutions to quickly evaluate, code and test GPU implementations (on multiple APIs) and multi-core implementations.

You will be surprised at some of the results…

Why would SOA become the dominant architecture for software development?

Wednesday, June 18th, 2008

In a recent blog post, Alex Cameron with EDS talks about SOA becoming the dominant architecture for Software Development. I could definitely see how this could be true. It seems software development has progressed and chosen certain styles of programming languages for a reason. As Java and C++ separated implementation from interface, developers realized they could more easily use another developer’s work without having to know what was going on under the covers. Companies and managers saw that they could more efficiently manage and control large projects with various teams interacting with each other. It led to easier-to-understand software, more productive development teams, and even documenting the software became simpler as the interface was a great guide as to what the component did.

So what is the extension of that? Not only would that developer like to use someone else’s work without knowing anything about it, but they also want to have access to work done on other OSes, on different hardware, in different languages, and all without having to understand the details. So the previous model of finding a .h file or some other class description in the appropriate programming language would be replaced by a search of WSDLs for the functionality needed. No longer would the developer be limited by language, platform, or in some cases, even geography or affiliation.

Rogue Wave / AMD partnership for Multi-core CPU and GPU

Wednesday, June 11th, 2008

Expansion of our Relationship

Rogue Wave and AMD have a long-standing partnership to advance C++ software development on AMD’s Opteron CPU platform. I’m excited that our two companies have recently announced an expansion of that relationship to make it easier for software applications to take advantage of the additional computing power available on multi-core CPUs and on GPUs (graphical processing units).

For several years, increased performance from all hardware vendors has largely come from additional “cores” instead of faster clock speeds. This provided significant additional processing power, but most existing software doesn’t adequately take advantage without significant modification. This is called the “Multi-core Dilemma”.

Challenge and Opportunity

The Multi-core Dilemma is both a challenge and an opportunity that will increase rapidly as the number of cores and threads continues to increase. A typical GPU already has 128 threads. For applications that lend themselves to parallel processing, this can mean a significant gain in throughput.

Although GPUs have the potential for even greater processing power than their CPU counterparts (for certain applications), there are additional challenges as well:
1. Developer productivity - use of the software tools requires special training.
2. Portability - software written for one vendor’s GPUs will not run on other GPUs or on CPUs.

Our partnership is designed to address both of these issues, and to close the gap between hardware and software that has been widening over the past few years.

Although both companies are committed to broadly applicable solutions, our initial focus is on the financial services industry, where much of the activity is already happening.

What are your experiences with multi-core CPU and / or GPU? Please post a response with your thoughts.

Matrix multiply in parallel - is a different result ok?

Monday, June 9th, 2008

When moving a production application from one system to another, extensive testing is generally done to ensure, among other things, that results from the new systems agree with expected results from the old system. This is true whether changing operating systems, hardware, or anything else.

For example, many financial services firms have moved from Unix systems to Linux for a variety of good reasons. When moving quantitative analysis applications, they had to verify - to multiple significant digits - that calculations done on Linux would not be different from what they got in the old system.

Different is not always wrong - sometimes a different new result is “more correct” - but it takes effort and time to verify that and make sure.

Now many companies are moving from sequential processing to parallel processing. This can actually be a bit trickier. Certain mathematical algorithms calculate differently in a parallel environment than in a sequential environment. This may not have anything to do with the implementation, it is often just the nature of the numbers.

Matrix multiplication is an example of this. Because floating-point addition is not associative, multiplying matrices in parallel can produce a different outcome: the multiplications and subsequent additions are necessarily grouped and performed in a different order.

Here is an example (thanks to David Haney):

Given two 4 x 4 matrices (A and B), you would normally calculate the result in 0,0 as:


(A00 * B00) + (A01 * B10) + (A02 * B20) + (A03 * B30)

If you change the order of operations though, like the following (note the parens):


((A00 * B00) + (A01 * B10)) + ((A02 * B20) + (A03 * B30))

Then you might see different results, depending on how the floating point rounding turns out. You probably won’t see much skew at this scale (especially if all of the numbers are roughly the same magnitude), but if you were dealing with a 1024 x 1024 matrix, you would probably start seeing some variation.
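The effect is easy to reproduce. The sketch below sums the four products of one row-by-column dot product in the two groupings described here. The input values are my own, deliberately chosen with very different magnitudes to exaggerate the rounding, and the function names are illustrative only; it assumes ordinary IEEE 754 double arithmetic without fast-math style compiler options.

```cpp
#include <array>
#include <cstddef>

// The four products of one row of A with one column of B.
using Vec4 = std::array<double, 4>;

Vec4 elementwiseProducts(const Vec4& row, const Vec4& col) {
    Vec4 p{};
    for (std::size_t k = 0; k < p.size(); ++k) {
        p[k] = row[k] * col[k];
    }
    return p;
}

// Sequential grouping: ((p0 + p1) + p2) + p3
double sumSequential(const Vec4& p) {
    return ((p[0] + p[1]) + p[2]) + p[3];
}

// Pairwise grouping, as a two-way parallel reduction would do it:
// (p0 + p1) + (p2 + p3)
double sumPairwise(const Vec4& p) {
    return (p[0] + p[1]) + (p[2] + p[3]);
}

// Illustrative inputs (an assumption of this sketch): the products are
// 1e20, 1, -1e20, 1, so the mathematically exact sum is 2.
Vec4 exampleProducts() {
    const Vec4 row = {1e10, 1.0, -1e10, 1.0};
    const Vec4 col = {1e10, 1.0, 1e10, 1.0};
    return elementwiseProducts(row, col);
}
```

With these inputs, sumSequential returns 1 and sumPairwise returns 0, while the exact answer is 2: adding 1 to 1e20 is lost to rounding, and which of the small terms survives depends only on the grouping. Neither grouping is "the bug"; they are just different roundings of the same expression.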

Some algorithms for breaking up a matrix multiply preserve results equivalent to the sequential computation while still executing at least part of the code in parallel. From what we’ve seen, though, those methods look less efficient than algorithms that do some amount of reordering.

The outcome, although different, may not be any less “correct”. But that difference may have business consequences that need to be planned for. Regardless of the software programming model and technology used to go parallel, this is something to be mindful of.

Release at Any Time: the Documentation Perspective

Tuesday, June 3rd, 2008

At Rogue Wave, the trend is towards agile development, with frequent releases of new features between major product releases. To this end, we maintain an impressive infrastructure of nightly automated testing of a large code base across a daunting number of platform, compiler, and database combinations. The system includes extensive reporting of test results against code quality baselines, regression analysis, and ongoing fixing of priority bugs. The goal is to maintain the code base at a high level of quality such that we can release at any time with confidence.

As a documentation person, the good news is that Rogue Wave has always valued documentation highly, and considers good documentation an important part of the product. The challenge is that documentation must therefore strive for the same level of consistent, release-at-any-time quality.

Getting There with Process Automation

When I realized that the documentation team either needed to match the agility and automation of the development teams or risk becoming less relevant, I could take comfort in the fact that documentation already had in place considerable process automation. For many years at Rogue Wave, a conversion architecture has supported the ability to reconvert FrameMaker source documents into the release formats easily and at any time. An added feature of this process was extensive reporting of formatting and linking problems found during the conversions.

The first step was to create infrastructure to support automated nightly conversion runs. The biggest obstacle was automating up-to-date PDF creation, the one main distribution format that was still created manually. A utility called FrameScript was the solution to that problem. With a little more creative jiggling, we reached a state where all documentation could be converted nightly, and the conversion error reporting neatly summarized on a single point of access Web page.

So far so good, but what customers expect to see is not an amorphous bundle of document files. They expect a well-formed document distribution, with convenient access points to the information. So we next devised a process for defining a manifest of everything that needed to be in a given product distribution, and a script to act on that manifest to create document distributions exactly as we expected to deliver to customers. Naturally we added some testing, too, resulting in a nightly distribution quality report.
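The manifest idea can be sketched in a few lines. This is not our actual script: the function name, the file names in the usage note, and the choice of C++ are assumptions made purely for illustration. The check simply reports every manifest entry that the nightly build failed to produce.

```cpp
#include <set>
#include <string>
#include <vector>

// Returns the manifest entries that are absent from the built distribution,
// in manifest order. An empty result means the distribution is complete.
std::vector<std::string> missingFromDistribution(
    const std::vector<std::string>& manifest,
    const std::set<std::string>& builtFiles)
{
    std::vector<std::string> missing;
    for (const std::string& path : manifest) {
        if (builtFiles.count(path) == 0) {
            missing.push_back(path);
        }
    }
    return missing;
}
```

For example, given a manifest listing index.html, userguide.pdf and refguide.pdf, and a build that produced only index.html and refguide.pdf, the function reports userguide.pdf. That is the kind of line a nightly distribution quality report would carry.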

Document Health

All well and good, but all of this automation counts for very little without a commitment and a process to maintain good document quality — what we choose to call document health. Scripts are very non-judgemental, which is the inspiration for the old saying about the consequences of feeding them garbage. So while we in documentation were emulating the automated processes of our development colleagues, we were also adopting their scrum-based agile methodology. As they work on a feature, we work beside them on its documentation. Critically, we also continually monitor the reports that come out of our nightly automation, and attempt to keep the errors at or near zero. This works quite well with the incremental changes expected with an established, stable product, not quite so well with the major revisions and refactorings that are the inevitable burden with dynamically changing newer products.

Even if the picture is sometimes less than completely rosy, there is no arguing with the vision. When it is going well, this approach gloriously meets its intended goal. The document distributions that are created each night are exactly the documentation we intend to release. If the document set is reasonably stable and we are on top of the errors, we truly can on any given day publish a document distribution to release engineering and be proud to have it given to our customers.

It doesn’t get any better than that.

Life after CORBA

Tuesday, June 3rd, 2008

I have been involved in distributed computing for a number of years, and recognize that Service Oriented Architecture is just another approach to getting distributed applications to work together. Previous generations include things like RPCs, DCE (http://en.wikipedia.org/wiki/Distributed_computing_environment), CORBA (http://www.omg.org/) etc. The advantage of SOA lies in the fact that the underlying standards, i.e. XML, SOAP etc., are broadly accepted across the industry, so interoperability between vendor products is much more real now than it has ever been previously.

I happen to have a good deal of experience in the CORBA world, having worked for Visigenic Software both before and after it was acquired by Borland. CORBA was an effective tool for connecting distributed objects, providing both language and platform neutrality. This was true so long as your platform was not Windows, because then you had to deal with COM/DCOM and the world of COM/CORBA bridges. This split between Microsoft and the rest of the world was a key issue that ultimately limited the proliferation of CORBA, but not before it was broadly adopted, particularly in the Telco and Financial Services industries. You still see many implementations of CORBA in what are now referred to as legacy applications, but not as much in newly developed systems.

Many of our SourcePro C++ customers also use CORBA orbs, most often Orbix. What we are finding is that many customers have applications that use older versions of Orbix that are no longer supported, and yet they continue to pay significant maintenance fees on those licenses. One customer explained that they feel they are at risk every time they touch the application, because if something breaks, they have no good avenue to seek help. This is not the ideal that IT strives for, i.e. it is both expensive and risky. The good news is that for many customers, there is a better alternative that is easy to put in place.

In many cases, orbs were used essentially as a communications mechanism between remote applications, maybe handling the mediation between C++ and Java applications. Today, this problem can easily be solved using a Web services approach. Rogue Wave has a product known as HydraExpress that has the capability to easily turn a C++ application into a service. For CORBA users the paradigm is familiar. This product can take WSDL (remember IDL?) as an input and generate stubs and skeletons for a Web services client or server. There are open source tools (http://search.cpan.org/~perrad/CORBA-XMLSchemas-0.41/idl2wsdl.pl) that help you to convert IDL to WSDL, which is the key step in the process. It is not always that simple, but often it is darn close. Once complete, you have an application based upon modern standards that gives you more flexibility and less risk, at significantly less cost. Sounds pretty good to me.