High performance lookup table in Fortran -


part of simulation code requires me find opacity given density , temperature. there no analytic relation this; standard method use 2d array opacity(i,j) correspond opacity density(i) , temperature(j), , run bilinear interpolation find exact opacity.

this bottleneck in group's code - each time step, interpolation routine called 100 million times different densities , temperatures, , accounts 20% of runtime. current code shown below - there tricks use improve it? use intel fortran 16, options -o3 -xavx -mcmodel=medium

function smoothopc(den, temp, ig, opc, rhot)      implicit none     real(kind=8), intent(in) :: den,temp      integer,intent(in) :: ig     real(kind=8), intent(in), dimension(1:50, 1:50, 1:52) :: opc     real(kind=8), intent(in), dimension(1:50, 1:2) :: rhot      real(kind=8) :: rho, te, smoothopc, r1, r2, t1, t2, &             interpolation, denominator, a, b, c, d, opc11, &             opc12, opc21, opc22, t, tp, r, rp      integer :: rid,tid,i      rho = den * 1d - 3          !g/cc     te = temp / (1.6d - 19)     !ev     tid = -1     rid = -1     = 1, 49         r = rhot(i, 1)         rp = rhot(i + 1, 1)         t = rhot(i, 2)         tp = rhot(i + 1, 2)         if (rho .ge. r)             rid =         endif          if (te .ge. t)             tid =         endif     enddo      r1 = rhot(rid, 1)     r2 = rhot(rid + 1, 1)     t1 = rhot(tid, 2)     t2 = rhot(tid + 1, 2)     opc11 = opc(rid, tid, ig + 4)     opc12 = opc(rid, tid + 1, ig + 4)     opc21 = opc(rid + 1, tid, ig + 4)     opc22 = opc(rid + 1, tid + 1, ig + 4)      denominator = (r2 - r1) * (t2 - t1)     = r2 - rho     b = rho - r1     c = t2 - te     d = te - t1      interpolation = * (c * opc11 + d * opc12) + b * &             (c * opc21 + d * opc22)      smoothopc = interpolation / denominator      return  end function smoothopc  

if understand correctly, doing loop through whole of rhot find indices use afterwards find values in opc.

if columns of rhot sorted, may faster write binary search instead (there overhead, have test).

also, don't quite understand condition assign rid , tid (it seem logical assign rid if r <= rho < rp). may missing how rhot built.


a trick might try: convert rho*m integer, m may instance power of 2 or 10 (multiply 1000 3 digits of precision). value rounded integer index array elements correct (or nearest) values of rid. if not correct rid, may have smaller range check. if scale not linear, may transform rho first.


another possible trick: store previous rid , tid indices. if calls follow rather continuous evolution, new indices near previous ones. if code has paralellized @ point, it's not idea introduces sequential dependences between calls.


Comments

Popular posts from this blog

php - Vagrant up error - Uncaught Reflection Exception: Class DOMDocument does not exist -

vue.js - Create hooks for automated testing -

Add new key value to json node in java -