Mark Dwyer


High Performance Computing Programmer and Data Specialist

Personal Info

Professional Profile

I live for solving difficult, real-world problems using mathematical principles and solid computer programming techniques. I believe most phenomena can be accurately modelled mathematically and realistically simulated in silico. I am seriously passionate about innovation and thrive in environments where cutting-edge, business-related research and development is being undertaken.

I have 12 years' experience in software research and development for next-generation, extreme-scale high-performance computing (HPC) systems and massive-scale data analytics and throughput methodologies. I apply these skills to every task put in front of me.

Work Experience

January 2014 - Present

Founder/Owner

Shearspace Pty Ltd


I am now getting very close to a version 1.0 release of my software. It has taken a lot of effort to get it to this stage and has required me to solve some very difficult issues along the way (you should see the performance of the new ray-box intersection algorithm I came up with – real-time graphics are now possible on a CPU). I’m currently filing global patent applications for quite a few things (on the advice of a few interesting people).
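For context, the kind of kernel a CPU renderer like this leans on is a ray-box test; the minimal sketch below shows the classic slab method only, as a generic illustration, and is not the proprietary Shearspace algorithm referred to above.

    // Classic "slab" ray-box intersection test -- a generic illustration of the
    // kind of inner kernel a CPU renderer hammers on; NOT the proprietary
    // Shearspace algorithm described above.
    #include <algorithm>
    #include <array>

    struct Ray { std::array<double, 3> origin, invDir; };  // invDir = 1 / direction, per axis
    struct Box { std::array<double, 3> lo, hi; };

    // Returns true if the ray hits the box; tNear/tFar bracket the hit interval.
    bool rayBoxIntersect(const Ray& r, const Box& b, double& tNear, double& tFar)
    {
        tNear = -1e300;
        tFar  =  1e300;
        for (int axis = 0; axis < 3; ++axis) {
            double t0 = (b.lo[axis] - r.origin[axis]) * r.invDir[axis];
            double t1 = (b.hi[axis] - r.origin[axis]) * r.invDir[axis];
            if (t0 > t1) std::swap(t0, t1);   // order slab entry/exit times
            tNear = std::max(tNear, t0);
            tFar  = std::min(tFar,  t1);
            if (tNear > tFar) return false;   // slabs stopped overlapping: miss
        }
        return tFar >= 0.0;                   // box must lie in front of the ray
    }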

Shearspace provides a simple way to perform massive-scale, real-time visualisation of datasets entirely within an HTML browser (no extensions, no WebGL, no GPU, purely native browser). It can deliver FullHD (1080p) resolution at 30 frames per second with full interaction on monster-sized datasets. The nature of Shearspace also enables people to share their datasets with stakeholders without compromising their data security. That is, if you own sensitive data, you can allow specific people to view that data without ever having to give them a physical copy. This helps on two main fronts: data security and the difficulty of moving very large datasets around. You can set a variety of rules regarding the extent to which stakeholders may view the data.

Shearspace loads data files natively, as it does not require specialised data formats. You can thank years of HPC performance programming for that one. Data files can be stored in AWS S3 at the standard pricing model (typically 3 cents per GB per month). As a result, you are not limited to visualising a single data type; you can view RGB, intensity, point returns, classification, scan angles, height – the full range of data. These data types can be manipulated with a user-defined, one-dimensional transfer function for interactive colourmapping.

[Images: Mobile Laser Scanning; Interactive colourmaps]
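A user-defined one-dimensional transfer function is, at its core, a mapping from a scalar attribute (intensity, height, classification, and so on) through a small colour lookup table. The sketch below shows that general idea only; the names are illustrative and are not Shearspace's actual API.

    // Illustrative 1-D transfer function: normalise a scalar attribute into [0,1]
    // and map it through a user-edited RGB lookup table. Hypothetical names only.
    #include <algorithm>
    #include <cstddef>
    #include <cstdint>
    #include <vector>

    struct RGB { std::uint8_t r, g, b; };

    struct TransferFunction1D {
        std::vector<RGB> table;     // user-edited control colours, e.g. 256 entries
        double lo = 0.0, hi = 1.0;  // attribute range mapped onto the table

        RGB map(double value) const {
            double t = std::clamp((value - lo) / (hi - lo), 0.0, 1.0);  // normalise
            auto idx = static_cast<std::size_t>(t * (table.size() - 1));
            return table[idx];      // nearest-entry lookup (could interpolate)
        }
    };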

The renderer itself is capable of drawing in excess of 12 billion points per second on a single core. This performance will scale proportionally as modern CPU architectures progress, and the renderer is multi-core capable (although I only use one core for rendering at this point because that is all I need). Of course, given the remote visualisation nature of the software and that the rendering resources can be located thousands of kilometres away, rendering speed is the least of the problems; I think the lag, bandwidth and other network problems have been solved nicely. Higher-order geometric primitives are also ready but not flagged for version 1.0 at this stage.

People will no longer have to purchase expensive graphics workstations or deal with expensive, restrictive software licenses that require yearly maintenance. Shearspace is available on demand with a very cheap pay-by-the-hour model, enabling clients to cost-optimise their IT infrastructure.

I’m not yet sure of the exact release date into the global AWS cloud, as I’m still stress-testing the performance of encryption and machine scaling (I have to ensure that my load balancer/monitor can seamlessly handle at least 10,000 globally located AWS instances).

If anybody would like to have a look and play with a pre-release version, just email me at the contact details at the bottom of the Shearspace website. I’d like to thank QCIF and QRIScloud for providing me with resources to enable client testing.

[Images: UltraHD Movie Production; Height Maps: What do you see?]

I’ve had a lot of fun writing this software. It is really nice to devote fifteen months of effort to an idea for remote visualisation that I’ve been mentally cooking up since 2003 (my first IEEE remote visualisation paper, on mobile devices before smartphones existed!), and it draws on everything I’ve learnt during my education and working career (20 years in total).

Shearspace website

Flagged for the version 2.0 release is the point cloud classification algorithm. Shearspace has been built with a scalable, parallel computational engine that enables highly computational algorithms to be applied to the point cloud. This is something that vendors cannot do, even with powerful workstations, as these algorithms can require hundreds of compute hours to deliver accurate results. Shearspace allows users to scale up their computational requirements simply, cheaply and on demand. I also hope to extend into Amazon spot pricing (100 compute core hours for less than a dollar), but my classification engine will take priority.


Update on LiDAR Automatic Classification (31/08/2015)

[Images: MLS with road noise (traffic); MLS with automatic noise removal]

July 2012 - December 2013

Big Data Specialist

Roames (Ergon Energy)


I was responsible for leading the research and development associated with providing solutions for dealing with massive-scale and complex spatial datasets. I applied strong mathematical and computer science skills and methodologies in the areas of data storage, feature extraction, analysis, visualisation and simulation to enable Roames to deliver scalable and optimised technological solutions.

  • Conduct and co-ordinate the research agenda necessary for Roames to manage very large and complex spatial datasets by applying expert knowledge and a sound understanding of mathematics and computer science
  • Identify, analyse and model new approaches in the area of 'Big Data' including the fusion of machine learning, computer science and algorithms
  • Recommend and assist with the development of approaches and technology solutions that will improve the Roames technology platform particularly in the areas of data storage, feature extraction, analysis, visualisation and simulation.
  • Ensure the research and development agenda is aligned with Roames priorities and opportunities.

Key Deliverables

  • Migrated Roames' software development and data processing pipeline from virtualised Windows infrastructure to Linux (reduced cost, improved throughput and enhanced system robustness).
  • Architected and built a robust data-processing compute cluster for Roames out of defunct hardware, reducing the data-processing pipeline from the existing 22 hours per 2 TiB to 30 minutes (average simultaneous processing). Roames now processes data in parallel, faster than the planes can capture it.
  • Took control of Roames' backup strategy (peta-scale). Recovery testing revealed that recalling a 1 TB mission (unit of data) took longer than a month using the existing commercial backup software. Through a hardware architecture redesign, some simple software development and re-use of the existing tape robot and defunct hardware, data is now automatically pushed to dual tapes and a 1 TB recall takes 124 minutes, close to the theoretical minimum for LTO-5 (at a native transfer rate of roughly 140 MB/s, 1 TB takes about two hours).
  • Backup recovery was essential to master since it was impossible for all Roames data to exist on spinning disks: there was too much data, and data growth threatened to outpace the ability to add more disks. Since software bugs and 'accidents' are inevitable in the development pipeline, it was necessary to master data backup so that, should such an event occur, the impact on the business is minimal.
  • Rewrote existing commercial code (RiANALYZE) that cost the business a large sum annually in licensing fees and deployed it to the cluster for automated parallel processing (removing the need for human button clicking). This is nanosecond-precision, high-intensity waveform analysis code for point cloud extraction from reverse-engineered, dual-laser, aircraft-mounted instrumentation. The code is 10x faster than the vendor-specific program (through code optimisation and reduced data movement) and extracts more points (with zero noise) thanks to robust mathematical techniques. The code has since been enhanced by mathematicians with specialist skills in wave analysis (modular software engineering), and Roames will continue to improve it for as long as Roames is in business. It is now critical software intellectual property for Roames (C/C++).
  • Implemented seekable compression and decompression algorithms so that Roames' existing data and network fabric could move extreme-scale data around on a traditional corporate infrastructure. The algorithm allows native spinning-disk seeking and the ability to stream data into and out of the compressed format (C++); a minimal sketch of the general block-indexing idea appears after this list. I'm seeking permission to open-source this library.
  • Liaised with hardware vendors to architect new compute and data infrastructure more suitable to Roames' massive-scale data and compute processing needs. Roames now has the blueprint, the knowledge and most of the skills required to scale its data processing by many orders of magnitude should the need arise.
  • Tuned critical portions of compute-intensive codes to make use of the CPU SSE and AVX instruction sets, applying the 80/20 rule as required (C/C++).
  • Investigated moving Roames' infrastructure into the Amazon cloud. Wrote a web-based program that, with a single click, automatically creates a compute cluster and software layer identical to the current compute cluster above, configured for the largest-scale data processing possible in Amazon (Java API).
  • Using Roames' (wonderful) data, I have exercised the rendering algorithms I've been developing for a long time.
[Image: 500 million double-precision 3D points of the Brisbane CBD, created by automatic extraction of waveform laser data]
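The seekable-compression bullet above omits the design details; a common way to obtain seekability is to compress fixed-size blocks independently and keep a per-block index, so a read at an arbitrary offset only decompresses the block it touches. The sketch below shows that general idea with zlib standing in for the actual codec; it is illustrative only and is not the Roames implementation.

    // Block-indexed ("seekable") compression sketch. Fixed-size blocks are
    // compressed independently, so a random read only inflates the block that
    // contains the requested offset. zlib's compress2()/uncompress() are used as
    // a stand-in codec; names and layout are illustrative only.
    #include <algorithm>
    #include <cstddef>
    #include <vector>
    #include <zlib.h>

    constexpr std::size_t kBlockSize = 1 << 20;  // 1 MiB of raw data per block

    struct SeekableArchive {
        std::vector<std::vector<unsigned char>> blocks;  // compressed blocks
        std::vector<std::size_t> rawSize;                // uncompressed size per block

        // Append raw data, splitting it into independently compressed blocks.
        void append(const unsigned char* data, std::size_t n) {
            for (std::size_t off = 0; off < n; off += kBlockSize) {
                std::size_t chunk = std::min(kBlockSize, n - off);
                uLongf outLen = compressBound(chunk);
                std::vector<unsigned char> out(outLen);
                compress2(out.data(), &outLen, data + off, chunk, Z_BEST_SPEED);
                out.resize(outLen);
                blocks.push_back(std::move(out));
                rawSize.push_back(chunk);
            }
        }

        // Random-access read: only the block containing `pos` is decompressed
        // (read limited to a single block here for brevity).
        std::vector<unsigned char> readAt(std::size_t pos, std::size_t n) const {
            std::size_t blk = pos / kBlockSize;
            std::size_t off = pos % kBlockSize;
            std::vector<unsigned char> raw(rawSize[blk]);
            uLongf rawLen = raw.size();
            uncompress(raw.data(), &rawLen, blocks[blk].data(), blocks[blk].size());
            n = std::min(n, raw.size() - off);
            return std::vector<unsigned char>(raw.begin() + off, raw.begin() + off + n);
        }
    };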

Feb 2005 - July 2012

Research Support Specialist

Queensland University of Technology


My job was to liaise with, consult with and support QUT researchers across a range of areas relating to the application and support of high performance computing infrastructure for research problems. This included practical support to researchers and research projects, including consultation and advice, specialist training, and tasks relating to the management and development of the IT research environment, including virtualised services. Support for both compute-intensive and data-intensive research codes requiring scientific code optimisation (e.g., memory management, code optimisation and parallelisation) was a focus of this role.

Primary Responsibilities

Provided research support to QUT institutes, faculties and individual researchers by:

  • providing advice to researchers on how to integrate and make best use of information technology and available IT services that support research;
  • providing an interface between Information Technology Services and University researchers for technology and/or research support matters;
  • establishing close links with research centres and individual researchers via liaison and other outreach programmes;

Supported researchers involved in the development of high-performance research codes and applications (e.g., compute- and data-intensive applications) on the available infrastructure by:

  • providing code optimisation and parallelisation consultation services;
  • providing a consultancy service on the optimal use of computational hardware (both compute and data storage) and software resources;
  • providing assistance and advice on the porting of scientific codes to high performance research infrastructure;
  • analysing research codes to identify and correct performance problems and bottlenecks;
  • developing and delivering technical training on the use of application development and optimisation tools and the optimal use of computational resources (both hardware and software);

Provided consultancy and support for high performance computing tools, applications and scientific libraries by:

  • solving researcher problems and providing technical support;
  • installing and maintaining research tools and applications;
  • developing research support documentation;
  • developing, maintaining and running technical training courses for researchers;

Technical Skills:
  • Architecture, maintenance and administration of a 100-node, 1,500-core hybrid-architecture compute cluster
  • Development and optimisation of scalable, parallel and serial computer codes (Fortran/C/C++/Matlab/Java/R) for various clients and stakeholders (Numerical Methods, Artificial Intelligence, Machine Learning, etc)
  • Development of algorithms to enable research in multi-disciplinary areas across all of Science, Technology, Engineering and Mathematics
  • Development of scalable, parallel graphics libraries for real-time, interactive visualisation of multi-terabyte datasets
  • Management and support of high-performance workstations and parallel supercomputing environments: SGI Altix XE cluster (including NVIDIA and FPGA hardware), SGI Altix 4700, IBM eCluster 1350, SGI Origin 3000 and Linux/OSX workstations
  • Consultation with and assistance to state- and nationally-based agencies: the Queensland Cyber Infrastructure Foundation and the Australian Partnership for Advanced Computing
  • Delivery of training courses in computation
Algorithm Development:
  • Linear Algebra: Gauss-Jordan, Gaussian Elimination, LU Decomposition (core, node, rack, gpu), Banded Systems, SVD, Cholesky, QR Decomposition (a representative sketch follows this list)
  • Interpolation and Extrapolation: Polynomial, Cubic Spline, Rational Functions, Grid and Scattered Data
  • Random Numbers: Parallel random number generation (a particularly difficult problem)
  • Sorting and Selection: Straight, Shell, Quick, Heap, combinations of selections
  • Root Finding: Secant, Newton, NR
  • Function Minimisation and Maximisation: Bracketing, Parabolic, Brent's, Downhill Simplex, Powell's, CG Method
  • Eigensystems: Jacobi, Hermitian, QR, Eigenvalues and Eigenvectors
  • Fourier: FFT, DFT, Real, multi-dimensional, Convolution and Deconvolution
  • Data Modelling: Least squares, nonlinear models, Markov, Gaussian, Clustering, Support Vector Machines, Bayesian
  • Computational Geometry (my favourite): *
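As a representative of the linear-algebra routines listed above, here is a minimal, unblocked Cholesky factorisation. It is illustrative only; the production versions referred to are blocked, cache-aware and parallelised across cores, nodes, racks and GPUs.

    // Unblocked Cholesky factorisation (A = L * L^T) of a symmetric positive-
    // definite matrix -- a minimal, illustrative version of one routine from the
    // list above; real implementations are blocked and parallelised.
    #include <cmath>
    #include <cstddef>
    #include <stdexcept>
    #include <vector>

    // A is an n x n row-major matrix. Returns the lower-triangular factor L
    // (row-major, strictly-upper part left as zero).
    std::vector<double> cholesky(const std::vector<double>& A, std::size_t n)
    {
        std::vector<double> L(n * n, 0.0);
        for (std::size_t i = 0; i < n; ++i) {
            for (std::size_t j = 0; j <= i; ++j) {
                double sum = A[i * n + j];
                for (std::size_t k = 0; k < j; ++k)
                    sum -= L[i * n + k] * L[j * n + k];
                if (i == j) {
                    if (sum <= 0.0)
                        throw std::runtime_error("matrix is not positive definite");
                    L[i * n + i] = std::sqrt(sum);
                } else {
                    L[i * n + j] = sum / L[j * n + j];
                }
            }
        }
        return L;
    }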

Feb 2003 - Feb 2005

Research Assistant (Mathematics)

Queensland University of Technology


  • Development of a large-scale three dimensional computational model for simulating the evolution of the saltwater fronts in Gooburrum (near Bundaberg)
  • Development of automatic hybrid 3D volumetric mesh generation for numerical simulation.
  • Visualisation of real-time, interactive, 3D unstructured, unsteady flow.
  • Developed high performance computing tools for parallelisation of the computational model to provide fast computation times for the simulations performed for this region.
  • Real-time interactive visualisation of the mesh and of Jacobian-free Krylov subspace iterative methods in an immersive stereoscopic environment.

Technical Skills

Expert, 10 years

Supercomputing


Optimisation and parallelisation of numerically intensive codes, scaling to multi-rack, multi-network compute clusters. Parallelisation strategies range from simple vectorisation (full utilisation of SSE instructions) to full socket-level programming for communication over NUMA or InfiniBand fabrics, coupled with pThread implementations.
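As a small illustration of that range of strategies, the sketch below splits an array reduction across pThreads with an inner loop the compiler can vectorise with SSE/AVX; real codes add NUMA-aware placement and inter-node communication (MPI or raw sockets over InfiniBand). It is a generic example, not code from any particular project.

    // Array reduction split across pThreads; the inner loop is simple enough
    // for the compiler to vectorise. Build with: g++ -O3 -pthread reduce.cpp
    #include <pthread.h>
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    struct WorkSlice {
        const double* data;
        std::size_t   begin, end;
        double        partial;      // per-thread result, no locking needed
    };

    void* sumSlice(void* arg)
    {
        WorkSlice* w = static_cast<WorkSlice*>(arg);
        double acc = 0.0;
        for (std::size_t i = w->begin; i < w->end; ++i)   // trivially vectorisable
            acc += w->data[i];
        w->partial = acc;
        return nullptr;
    }

    int main()
    {
        const std::size_t n = 1 << 24, nThreads = 8;
        std::vector<double> data(n, 1.0);

        std::vector<pthread_t> threads(nThreads);
        std::vector<WorkSlice> slices(nThreads);
        for (std::size_t t = 0; t < nThreads; ++t) {
            slices[t] = { data.data(), t * n / nThreads, (t + 1) * n / nThreads, 0.0 };
            pthread_create(&threads[t], nullptr, sumSlice, &slices[t]);
        }

        double total = 0.0;
        for (std::size_t t = 0; t < nThreads; ++t) {
            pthread_join(threads[t], nullptr);
            total += slices[t].partial;
        }
        std::printf("sum = %f\n", total);
        return 0;
    }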

I have ported, tuned or parallelised more than 650 floating-point-intensive codes in the last seven years. On some projects (engineering simulation, for example) I have achieved an 800x speedup after 45 minutes of code inspection; another, in computational statistics, reduced runtime from 20 years to 24 hours (a 7,300x speedup).

[Image: Randomly selected month of savings achieved by code optimisation]

The types of codes include, but are not limited to, Econometrics, Non-linear Algebra, Computational Statistics, Information Systems, Particle Physics, Bio-Mechanical Engineering, Bioinformatics, Business, Geographic Systems, Accountancy, Electrical Engineering, Structural Engineering, Geology and Mining.

[Images: Measuring cache latency; SIMD registers; Core/cache locality]

Expert, 11 years

Data Processing


Highly capable of processing extremely large datasets (the largest so far a 3-petabyte compressed dataset) in theoretically minimal elapsed times, and that includes development times. Design and architecture of tailored or "off-the-shelf" hardware using the ZFS, NFS, ext4 and XFS filesystems.

Worldscope (including automatic updates), LiDAR (GIS, extreme-scale point clouds), 10 years of Australian electricity spot prices, the Wikipedia corpus, World Intellectual Property Organization patent applications and Synchrotron data are but a few examples.

[Image: Automatic software pipeline to manage data-processing workflow on a compute cluster]

Expert, 13 years

Visualisation


While dealing with large data, it becomes necessary to examine data, dataflow and data structures in meaningful ways to reveal hidden structure not apparent from algorithmic analysis. I've always enjoyed employing numerous techniques to generate scientific visualisations and am quite passionate about it. I am very well versed in the GTK, GLX, OpenGL, SDL and VTK programming libraries for generating spectacular images, videos and interactive programs across a plethora of viewing environments. In my spare time, I even work on my own visualisation systems.

[Images: Interactive native megapixel displays of massive-scale data (own software libraries); Interactive, large-scale volumes]

Languages

Expert, 18 years

C/C++


Highly proficient down to the bit-manipulation level. Parallelisation using any technology, but mostly versed in MPI/OpenMP/pThreads/TBB/IPP. Preference for the Intel compiler (on Intel hardware) and the GNU compilers. Very familiar with profiling tools: VTune (amplxe-xxx), gprof, codecov, valgrind, inspxe-xxx, etc.

Well versed in CPU architectures (past, present and future), including the number and types of registers available and how to program to use them without resorting to assembly code. This approach enables programs to scale to new processor architectures without having to rewrite code.
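A small illustration of that approach: write the hot loop so the compiler's auto-vectoriser can map it onto whatever SIMD registers the target has (SSE xmm, AVX ymm, AVX-512 zmm), so the same source scales to newer CPUs just by recompiling. The kernel below is generic, not taken from any specific project.

    // An axpy-style kernel the compiler can vectorise onto the widest available
    // SIMD registers. __restrict tells the compiler the arrays do not alias,
    // which is often the difference between scalar code and packed SSE/AVX.
    #include <cstddef>

    void axpy(double a, const double* __restrict x, double* __restrict y, std::size_t n)
    {
        for (std::size_t i = 0; i < n; ++i)
            y[i] += a * x[i];
    }
    // Build e.g. with: g++ -O3 -march=native axpy.cpp   (or icc -O3 -xHost)
    // Recompiling for a newer -march picks up wider registers with no source change.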

Have delivered many training courses over the years on scientific programming with C++ and 'bare metal' programming.  These training courses have inspired and encouraged programmers new to the challenges of the "Big Three": Big Compute, Big Data and Big Visualisation.

Have developed numerous algorithms in all areas of Science, Technology, Engineering and Mathematics. These algorithms have required writing from the ground up, since most available libraries are inefficient, resource-hogging and don't take advantage of recent CPU advances. This means total design of data structures to be as lightweight as possible yet extensible, to take advantage of various cache-line sizes and CPU registers.

Proficient, 5 years

Fortran


Strong exposure to Fortran thanks to 'old school' professorial academics. Experience with tuning, parallelising and writing checkpointing code for simulations that can run for years.

Expert, 17 years

Matlab


Intimate knowledge of Matlab and its toolboxes. I have frequently ported codes to make use of MEX files, the Matlab compiler, or the parallel and distributed computing toolboxes (including GPU implementations). When performing exploratory algorithm selection and implementation I turn to Matlab simply for its speed of development.
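A MEX file at its simplest is just a C/C++ gateway function. The hypothetical sketch below squares each element of an array passed in from MATLAB and shows the basic in/out pattern; it is illustrative only, not code from any specific port.

    // Minimal MEX gateway sketch: squares each element of a double array.
    // Compile from MATLAB with:  mex square_elements.cpp
    // Call from MATLAB as:       y = square_elements(x);
    // Illustrative only -- real ports add validation, OpenMP, Intel libraries, etc.
    #include "mex.h"

    void mexFunction(int nlhs, mxArray* plhs[], int nrhs, const mxArray* prhs[])
    {
        (void)nlhs;                                      // single output assumed
        if (nrhs != 1 || !mxIsDouble(prhs[0]))
            mexErrMsgTxt("Expected one double array input.");

        mwSize n = mxGetNumberOfElements(prhs[0]);
        const double* in = mxGetPr(prhs[0]);

        plhs[0] = mxCreateDoubleMatrix(1, n, mxREAL);    // 1 x n output
        double* out = mxGetPr(plhs[0]);

        for (mwSize i = 0; i < n; ++i)                   // hot loop runs in C++
            out[i] = in[i] * in[i];
    }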

I frequently blog about using Matlab, Mex files and the Intel Compiler and Intel Libraries and tools.

Expert, 15 years

Bash


Being a Linux user, I primarily turn to bash (sed, awk, make, etc.) and its abundance of command-line tools to achieve quick turnaround times for project delivery. Bash (or command-line scripting in general) is often overlooked as a way to achieve goals quickly yet reproducibly.

Proficient, 15 years

Perl


I've used Perl extensively over the years to perform all sorts of tasks. I have found it particularly useful for talking, via a web interface, to underlying cluster submission systems. Although not computational, Perl provides many advantages in terms of quick analysis and processing of input data. One advantage of Perl is that it is easy for a client to use, since its portability is nearly universal.

Proficient, 16 years

Java


Not an ideal language choice for computational programming, but I have used Java extensively for web delivery of services and simple system-monitoring tools (it should be noted that these language advantages have largely been superseded by HTML5/JS/CSS programming). Servlets, graphics and mobile-device applications are some of the programs I have written and delivered for clients and projects.

Proficient, 9 years

HTML5/JS


This is a relatively new paradigm, but it is easy to see how powerful HTML5 will be in delivering services and results, particularly computational results. One of the biggest problems with delivering services has been how to make a program easy to use for those with limited computer literacy. HTML5 is a way to hide difficult OS particularities and allow easy communication with a back-end service (on a supercomputer or other powerful, appropriate hardware) to deliver the final result.

I have begun to port many existing services to use the full capabilities of HTML5, particularly the Worldscope database, other financial databases and GIS database frontends. It is also quick and easy to provide simple visualisation services (particularly information visualisation) to these existing services for quick perusal of data.

Proficient

*


There are many other languages that I know but cannot recall using in the last six months. I have listed above the main languages that achieve most of what needs to be solved with computer languages. I've regularly worked with R, Python, Ruby, LISP, C#, Basic and VB (and their associated libraries and frameworks), to name but a few. I have even solved problems for people in languages I had never heard of and whose names I can no longer recall. Programming is about a state of mind and a way of thinking, not about the language. This is why it is referred to as 'The Art of Computer Programming'.

Education

Queensland University of Technology

Bachelor of Science (Mathematics)


Studied advanced calculus, linear algebra, vectors and matrices, computational mathematics, differential equations. Specialised in applied mathematics, scientific computation and visualisation.

Bachelor of Information Technology


Majored in software engineering. Studied data warehousing, digital environments, enterprise systems, network systems and web technologies.

Stanthorpe State High School

OP 1 (1995)


