A test of leading AI agents found that token consumption varied widely, with no transparency and no guarantees of ...