Tuesday, 22 May 2018

Processing close

People sometimes write code that is very assertive about one piece of a system or another, or, in more suspicious cases, about many of them.
That typically won't yield performant code: something will often turn up to slow your job down.

A specific piece of advice:

0) Keep (simple) processing close to the database.


By simple I mean something that translates easily to a few machine-level instructions per iteration (e.g. a sum), finishes in a few iterations (e.g. one), and doesn't operate on a large dataset in each iteration.
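A minimal sketch of that idea, assuming Python's built-in sqlite3 module and a made-up orders table (neither comes from the original setup). SQLite runs in-process, so the boundary here is cheap; with a networked database server the difference would likely matter more.

```python
import sqlite3

# Hypothetical in-memory table, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)",
    [(n,) for n in range(1, 1001)],
)

# Far from the data: every row crosses the database boundary,
# and the sum is computed on the client side.
total_in_client = sum(
    amount for (amount,) in conn.execute("SELECT amount FROM orders")
)

# Close to the data: the database does the simple sum,
# and only a single value crosses the boundary.
(total_in_db,) = conn.execute("SELECT SUM(amount) FROM orders").fetchone()

print(total_in_client, total_in_db)
conn.close()
```

Both variants give the same total; the second one just moves far less data around to get there.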

Let's generalize it:

1) Avoid crossing boundaries between system parts unnecessarily.


Or: try to consider alternatives where fewer communication routes are involved.

The boundary crossings to avoid could be anything really, depending on the scale of operation and on the magnification at which we consider the system:
  • Querying data from a server
  • Calling into a DLL/library
  • Exchanging information with a service/daemon (inter-process - but also, probably with a smaller overhead, synchronized inter-thread communication).
  • Looking up data from another table (one implication: look for unnecessary JOINs, maybe consider denormalization, esp. in a NoSQL setting)
  • Calling Java bytecode from a compiled binary
  • Calling another function (remember there's a cost of leaving return information on top of the stack)
  • Blocking on an on-screen confirmation from the user
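To make the cost of a chatty design a bit more tangible, here is a rough sketch. CountingStore is a hypothetical stand-in for any remote part of a system (a server, a service, another process); it does nothing but count how often the boundary is crossed.

```python
class CountingStore:
    """Hypothetical stand-in for a remote service; it only counts calls."""

    def __init__(self, data):
        self._data = dict(data)
        self.round_trips = 0

    def get(self, key):
        # One boundary crossing per key.
        self.round_trips += 1
        return self._data[key]

    def get_many(self, keys):
        # One boundary crossing for the whole batch.
        self.round_trips += 1
        return [self._data[k] for k in keys]


data = {"a": 1, "b": 2, "c": 3, "d": 4}
keys = ["a", "b", "c", "d"]

# Chatty: one call per key.
chatty = CountingStore(data)
chatty_values = [chatty.get(k) for k in keys]

# Batched: a single call for all keys.
batched = CountingStore(data)
batched_values = batched.get_many(keys)

print(chatty.round_trips, batched.round_trips)
```

The values come out the same either way; only the number of crossings differs (one per key versus one in total).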

 

A major exception: if the overall complexity is high.


In this case it may be worth turning to hardware or software dedicated to the processing: a GPU, an SSD, scaling up, a cluster and so on, as opposed to the CPU/FPU, an HDD, your regular hardware or a single compute node. All of this involves moving data from one part of the system, less suited to the processing task, to another that is better suited, possibly more dedicated, and potentially a temporarily allocated cloud resource.

However, it's also worth noting:

 

Faulty and/or critical system components may cry out for redundancy


Such as people. And with them come the desirable boundaries, but also peer insight. As for whether the superior efficiency of one-man teams is a myth... well, while we'd like to believe in myths, we do know that a one-man team may not always be the best that can happen. You may need very proficient and disciplined people for that to work out, and these days, with an expanding IT industry, the average years of experience are plummeting.
Stumbling upon such people could be far more the exception than the norm.

Thursday, 17 May 2018

A quick reminder to self about error handling in R (baby steps #1)

I guess the code below pretty much says it all:

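tryCatch({
  # x() raises the error...
  x <- function() {
    stop("hahaha now you blew the code!")
  }
  # ...and y() merely calls x(), so the failing call is one level down.
  y <- function() {
    x()
  }
  y()
}, error = function(e) {
  # e$call is the call in which stop() was invoked; e$message is its text.
  print(e$call)
  print(e$message)
})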


And then sourcing it gives:

x()
[1] "hahaha now you blew the code!"


I guess that's all there is to it in practice...
... except maybe that there's also a finally parameter, which I never notice:

function (expr, ..., finally) 

So... worth a second look :)

(To be continued...)

Monday, 7 May 2018

JavaScript versus integer sequence

Looks like the ever-feared, popularly hated JS really has its drawbacks... You need functional programming to create an integer sequence? You need a loop? And wouldn't anybody want to just do something about it? :)

E.g.
https://stackoverflow.com/questions/3895478/does-javascript-have-a-method-like-range-to-generate-a-range-within-the-supp

Jeez :) time to realize how good Python and R are... well, R as a language only in certain aspects, but still.
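
For comparison, a tiny sketch of the Python side, using only the built-in range (the variable names are just for illustration):

```python
# Integers 1 through 5; range() excludes its stop value.
one_to_five = list(range(1, 6))

# Every second integer from 0 up to, but not including, 10.
evens = list(range(0, 10, 2))

print(one_to_five)
print(evens)
```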