High performance lookup table in Fortran -

March 15, 2010

part of simulation code requires me find opacity given density , temperature. there no analytic relation this; standard method use 2d array opacity(i,j) correspond opacity density(i) , temperature(j), , run bilinear interpolation find exact opacity.

this bottleneck in group's code - each time step, interpolation routine called 100 million times different densities , temperatures, , accounts 20% of runtime. current code shown below - there tricks use improve it? use intel fortran 16, options -o3 -xavx -mcmodel=medium

function smoothopc(den, temp, ig, opc, rhot)      implicit none     real(kind=8), intent(in) :: den,temp      integer,intent(in) :: ig     real(kind=8), intent(in), dimension(1:50, 1:50, 1:52) :: opc     real(kind=8), intent(in), dimension(1:50, 1:2) :: rhot      real(kind=8) :: rho, te, smoothopc, r1, r2, t1, t2, &             interpolation, denominator, a, b, c, d, opc11, &             opc12, opc21, opc22, t, tp, r, rp      integer :: rid,tid,i      rho = den * 1d - 3          !g/cc     te = temp / (1.6d - 19)     !ev     tid = -1     rid = -1     = 1, 49         r = rhot(i, 1)         rp = rhot(i + 1, 1)         t = rhot(i, 2)         tp = rhot(i + 1, 2)         if (rho .ge. r)             rid =         endif          if (te .ge. t)             tid =         endif     enddo      r1 = rhot(rid, 1)     r2 = rhot(rid + 1, 1)     t1 = rhot(tid, 2)     t2 = rhot(tid + 1, 2)     opc11 = opc(rid, tid, ig + 4)     opc12 = opc(rid, tid + 1, ig + 4)     opc21 = opc(rid + 1, tid, ig + 4)     opc22 = opc(rid + 1, tid + 1, ig + 4)      denominator = (r2 - r1) * (t2 - t1)     = r2 - rho     b = rho - r1     c = t2 - te     d = te - t1      interpolation = * (c * opc11 + d * opc12) + b * &             (c * opc21 + d * opc22)      smoothopc = interpolation / denominator      return  end function smoothopc

if understand correctly, doing loop through whole of rhot find indices use afterwards find values in opc.

if columns of rhot sorted, may faster write binary search instead (there overhead, have test).

also, don't quite understand condition assign rid , tid (it seem logical assign rid if r <= rho < rp). may missing how rhot built.

a trick might try: convert rho*m integer, m may instance power of 2 or 10 (multiply 1000 3 digits of precision). value rounded integer index array elements correct (or nearest) values of rid. if not correct rid, may have smaller range check. if scale not linear, may transform rho first.

another possible trick: store previous rid , tid indices. if calls follow rather continuous evolution, new indices near previous ones. if code has paralellized @ point, it's not idea introduces sequential dependences between calls.

Search This Blog

Insert

High performance lookup table in Fortran -

Comments

Post a Comment

Popular posts from this blog

vue.js - Create hooks for automated testing -

php - Vagrant up error - Uncaught Reflection Exception: Class DOMDocument does not exist -

serial port - hub4com OVERRUN Error -