Getting Automation Right with Big Data
Things To Remember While Automating With Big Data

Big data automation can mean writing dozens of scripts to process different input sources and aligning them in order to consolidate all this data and produce the required output.

Why exactly do you need big data for your enterprise projects? Many industry observers have noted that although a lot of enterprises claim their big data projects are aimed at "deriving insights" that replace human intuition with data-driven alternatives, the real objective appears to be automation. They point out that the role of data scientists at many organizations has little to do with replacing human intuition with big data. Instead, it is about augmenting human experience by making it easier, faster and more efficient.

But automating big data processing is easier said than done, and the biggest problem is that big data is, well, big. This means there is a lot of chaos and inconsistency in the available data. As a result, creating a single MapReduce script that can instantly ingest all your data and process the results is just wishful thinking. In reality, big data automation can mean writing dozens of scripts to process different input sources and aligning them in order to consolidate all this data and produce the required output.
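As a rough illustration of what "aligning" several source-specific scripts can look like, the sketch below normalizes two differently shaped inputs into one common schema before consolidating them. The sample data, field names (`user_id`, `amount`, `uid`, `total`) and function names are all assumptions made up for this example, not part of any particular toolchain.

```python
import csv
import io
import json

# Hypothetical raw inputs; the field names are assumptions for illustration.
CSV_SOURCE = "user_id,amount\n1,10.50\n2,4.00\n"
JSON_SOURCE = '[{"uid": 3, "total": "7.25"}, {"uid": 1, "total": "2.00"}]'


def parse_csv_source(text):
    """Normalize CSV rows to the common {user_id, amount} schema."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"user_id": int(row["user_id"]), "amount": float(row["amount"])}
        for row in reader
    ]


def parse_json_source(text):
    """Normalize JSON records, which use different field names."""
    return [
        {"user_id": int(rec["uid"]), "amount": float(rec["total"])}
        for rec in json.loads(text)
    ]


def consolidate(records):
    """Sum amounts per user across all normalized records."""
    totals = {}
    for rec in records:
        totals[rec["user_id"]] = totals.get(rec["user_id"], 0.0) + rec["amount"]
    return totals


# One small parser per source, one shared consolidation step.
records = parse_csv_source(CSV_SOURCE) + parse_json_source(JSON_SOURCE)
totals = consolidate(records)
print(totals)
```

In a real deployment each parser would typically be its own job reading from its own source, but the shape is the same: many small source-specific steps feeding one shared schema.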

The first thing to get right with respect to automating big data is the architecture. One of the most popular ways to set up big data automation is through data lakes. Put simply, a data lake is a large storage repository that holds raw data until it is needed for processing. Unlike traditional hierarchical data warehouses, a data lake stores raw data in a flat architecture. One of the key advantages here is that a data lake can store all sorts of data (structured, semi-structured and unstructured) and is thus well suited to big data automation.
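To make the idea of a flat store for raw data concrete, here is a toy in-memory sketch. The `TinyDataLake` class, its methods and the metadata keys (`source`, `fmt`) are hypothetical names invented for this example; a production data lake would sit on distributed or object storage rather than a Python dictionary.

```python
import json
import uuid


class TinyDataLake:
    """Toy in-memory stand-in for a data lake (hypothetical, for illustration)."""

    def __init__(self):
        # Flat namespace: object id -> (metadata, raw bytes). No hierarchy.
        self._objects = {}

    def put(self, raw, **metadata):
        """Store a raw payload untouched, tagged with descriptive metadata."""
        object_id = str(uuid.uuid4())
        self._objects[object_id] = (metadata, raw)
        return object_id

    def find(self, **criteria):
        """Return raw payloads whose metadata matches every criterion."""
        return [
            raw
            for metadata, raw in self._objects.values()
            if all(metadata.get(key) == value for key, value in criteria.items())
        ]


lake = TinyDataLake()
lake.put(b'{"user": 1, "clicks": 3}', source="web", fmt="json")   # semi-structured
lake.put(b"user,clicks\n2,5\n", source="mobile", fmt="csv")       # structured
lake.put(b"free-text support ticket", source="helpdesk", fmt="text")  # unstructured

# Structure is applied only when the data is read for processing.
json_events = [json.loads(raw) for raw in lake.find(fmt="json")]
print(json_events)
```

The point of the sketch is that nothing is parsed or reshaped on the way in; each consumer decides how to interpret the raw bytes when it reads them.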

The next thing to get right is agility. Traditional data sources are structured, and using data warehouse technology ensures seamless and efficient processing of that data. With big data, though, this rigid structure can be a disadvantage. Data scientists need to build agile systems that can be easily configured and reworked in order to quickly and efficiently navigate the multitude of data sources and build an automation system that works.

While challenges such as those mentioned above can be tackled by choosing the right technologies, there are other problems with big data that need to be dealt with at a more granular level. One example is manipulative algorithms that can bring about vastly different outputs; rogue or incompetent developers can also cause automation issues that are extremely difficult to track down and fix. Another issue is misinterpretation of data. An automated big data system could magnify minor discrepancies in data and feed them into a loop that leads to grossly misleading outputs.

These are issues that cannot be wished away, and the only way to get automation right in such cases is by diligently monitoring and evaluating the code and outputs. This way, it is possible to identify discrepancies in the algorithm and outputs before they can blow up. From a business perspective, this means additional resources to test and validate the code and output at each stage of the development and operational cycle. This could effectively reduce the cost advantage that big data automation has. But it is a necessary expense if businesses want to establish a sustainable big data automation product that also works.
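One simple form such monitoring can take is comparing each stage's output against a known baseline and flagging anything that drifts too far. The sketch below assumes a hypothetical `check_output` helper, made-up metric names and an arbitrary 20% tolerance; real thresholds would need to be chosen per metric.

```python
def check_output(name, value, baseline, tolerance=0.2):
    """Return a warning string if value drifts beyond tolerance, else None."""
    if baseline == 0:
        return None if value == 0 else f"{name}: baseline is 0 but got {value}"
    drift = abs(value - baseline) / abs(baseline)
    if drift > tolerance:
        return f"{name}: drifted {drift:.0%} from baseline {baseline}"
    return None


# Hypothetical stage outputs and baselines, invented for this example.
baselines = {"row_count": 1000, "avg_order_value": 50.0}
todays_outputs = {"row_count": 1040, "avg_order_value": 95.0}

alerts = []
for name, value in todays_outputs.items():
    message = check_output(name, value, baselines[name])
    if message is not None:
        alerts.append(message)

# A non-empty list means a human should review before results propagate.
for message in alerts:
    print(message)
```

Running a check like this after every stage, rather than only at the end, keeps a small discrepancy from being fed forward and amplified by later steps.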

About Harry Trott
Harry Trott is an IT consultant from Perth, WA. He is currently working on a long-term project in Bangalore, India. Harry has over 7 years of work experience on cloud- and networking-based projects. He is also working on a SaaS-based startup which is currently in stealth mode.
