blueice package

Submodules

blueice.data_reading module

Utilities for reading files specified in configuration dictionaries

blueice.data_reading.read_csv(filename)[source]
blueice.data_reading.read_files_in(d, data_dirs=('.', ))[source]

Return a new dictionary in which every value in d that is a string ending in a supported extension is replaced with that file’s contents. Other keys are left alone. A cache is maintained so each file is read only once. :param data_dirs: directories in which to look for files. Defaults to ('.',).
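
A minimal sketch of passing a configuration dictionary through read_files_in; the file name is hypothetical, and .csv is assumed to be a supported extension since read_csv is one of this module’s readers:

    from blueice.data_reading import read_files_in

    config = dict(
        energy_threshold=3.0,                 # not a filename string: left alone
        efficiency_map='efficiency_map.csv',  # hypothetical file: replaced by its parsed contents
    )
    config = read_files_in(config, data_dirs=('.', './data'))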

blueice.exceptions module

exception blueice.exceptions.BlueIceException[source]

Bases: Exception

exception blueice.exceptions.InvalidParameter[source]

Bases: blueice.exceptions.BlueIceException

A particular parameter to the likelihood is not present

exception blueice.exceptions.InvalidParameterSpecification[source]

Bases: blueice.exceptions.BlueIceException

An add_x_parameter method was called incorrectly

exception blueice.exceptions.NoOpimizationNecessary[source]

Bases: blueice.exceptions.BlueIceException

exception blueice.exceptions.NoShapeParameters[source]

Bases: blueice.exceptions.BlueIceException

exception blueice.exceptions.NotPreparedException[source]

Bases: blueice.exceptions.BlueIceException

exception blueice.exceptions.OptimizationFailed[source]

Bases: blueice.exceptions.BlueIceException

exception blueice.exceptions.PDFNotComputedException[source]

Bases: blueice.exceptions.BlueIceException

blueice.inference module

Helper functions for analysis using LogLikelihood functions

Blueice’s main purpose in life is to provide you with a likelihood function, but some operations are so common we could not resist adding canned tools for them. Don’t worry if your particular analysis is not covered: just use whatever tools you want with either the LogLikelihood function itself, or the output of make_objective (if your tools expect a positional-argument-only function).

Functions from this file are also made accessible as methods of LogLikelihoodBase.

blueice.inference.best_anchor(lf)[source]

Return shape parameter dictionary of anchor model with highest likelihood. Useful as a guess for further fitting.

blueice.inference.make_objective(lf, guess=None, minus=True, rates_in_log_space=False, **kwargs)[source]

Return convenient stuff for feeding the LogLikelihood lf to an optimizer.

Parameters:
  • guess – dictionary with guesses for the remaining (“floating”) parameters. If you don’t supply a guess for a parameter, the base model setting will be taken as the guess.
  • minus – if True (default), multiply the result of LogLikelihood by -1. Minimizers tend to appreciate this; samplers like MCMC do not.
  • rates_in_log_space – UNTESTED: let the minimizer work on the rate multipliers in log space instead
  • **kwargs – fixed values for certain parameters. These will not be fitted.
Returns f, guesses, names, bounds:
  • f: function which takes a single arraylike argument containing only the floating parameters
  • guesses: array of guesses in the order taken by f
  • names: list of floating parameter names in the correct order
  • bounds: list of tuples of bounds for the floating parameters; (None, None) if a parameter has no bounds.
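
A minimal sketch of feeding make_objective’s output to scipy.optimize.minimize; lf is assumed to be a prepared LogLikelihood with data set, and the fixed parameter name is hypothetical:

    import scipy.optimize
    from blueice.inference import make_objective

    # Fix one parameter via a keyword argument; everything else floats.
    # The unpacking order follows the docstring above.
    f, guesses, names, bounds = make_objective(lf, signal_rate_multiplier=1.0)
    result = scipy.optimize.minimize(f, guesses)
    best_fit = dict(zip(names, result.x))  # map fitted values back to parameter names
    max_loglikelihood = -result.fun        # undo the minus=True sign flip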
blueice.inference.bestfit_scipy(lf, minimize_kwargs=None, rates_in_log_space=False, pass_bounds_to_minimizer=False, **kwargs)[source]

Minimizes the LogLikelihood function lf over the parameters not specified in kwargs. Returns {param: best fit}, maximum loglikelihood.

Optimization is performed with the scipy minimizer. :param minimize_kwargs: dictionary of options passed to the minimizer. :param pass_bounds_to_minimizer: if True (default False), pass bounds to the minimizer via its bounds argument. This shouldn’t be necessary, as the likelihood function returns -inf outside the bounds. I’ve gotten strange results with L-BFGS-B, scipy’s default method for bound problems; perhaps it is less well tested? If you pass this, I recommend also passing a different minimizer method (e.g. TNC or SLSQP).

Other kwargs are passed to make_objective.
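
A minimal usage sketch, assuming lf is a prepared LogLikelihood with data set; the parameter names are hypothetical:

    from blueice.inference import bestfit_scipy

    # Fit all floating parameters, fixing the signal rate via a keyword argument.
    best_fit, max_ll = bestfit_scipy(lf, signal_rate_multiplier=0.0)
    # best_fit is e.g. {'background_rate_multiplier': 1.02, ...};
    # max_ll is the log-likelihood at the best fit.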

blueice.inference.bestfit_minuit(lf, minimize_kwargs=None, rates_in_log_space=False, **kwargs)[source]

Minimizes the LogLikelihood function lf over the parameters not specified in kwargs. Returns {param: best fit}, maximum loglikelihood.

Optimization is performed with iminuit’s Minuit. :param minimize_kwargs: dictionary of options passed to the minimizer.

Other kwargs are passed to make_objective.

blueice.inference.plot_likelihood_ratio(lf, *space, vmax=15, bestfit_routine=None, plot_kwargs=None, **kwargs)[source]

Plots the negative log-likelihood ratio derived from the LogLikelihood lf in a parameter space. :param lf: LogLikelihood function with data set. :param space: list/tuple of tuples (dimname, points to plot). :param vmax: Limit for the color bar (2d) or the y axis (1d). :param plot_kwargs: kwargs passed to plt.plot / plt.pcolormesh. Further arguments are passed to lf; arguments not passed are fitted at each point. :return: Nothing
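
A minimal 1d sketch, assuming lf is a prepared LogLikelihood with data set and the parameter name is hypothetical:

    import numpy as np
    import matplotlib.pyplot as plt

    # Scan along one rate multiplier; parameters not passed are fitted at each point.
    lf.plot_likelihood_ratio(('signal_rate_multiplier', np.linspace(0, 3, 30)), vmax=10)
    plt.show()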

blueice.inference.one_parameter_interval(lf, target, bound, confidence_level=0.9, kind='upper', bestfit_routine=None, t_ppf=None, **kwargs)[source]

Set a confidence_level interval of kind (central, upper, lower) on the parameter target of lf. This assumes the likelihood ratio is asymptotically chi2(1)-distributed (Wilks’ theorem).
:param target: parameter of lf to constrain.
:param bound: bound(s) for the line search. For upper and lower: a single value; for central: a 2-tuple.
:param t_ppf: function (hypothesis, level) -> test statistic (-2 Log[ L(test)/L(bestfit) ]); must return the value at which the test statistic reaches the level’th quantile if the hypothesis is true. If not specified, Wilks’ theorem will be used.
:param kwargs: dictionary with arguments to bestfit.
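
A minimal sketch of a 90% confidence-level upper limit, assuming lf is a prepared LogLikelihood with data set; the parameter name is hypothetical:

    # Line-search for the upper limit on the signal rate multiplier,
    # scanning up to a multiplier of 10.
    limit = lf.one_parameter_interval('signal_rate_multiplier', bound=10.,
                                      confidence_level=0.9, kind='upper')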

blueice.inference.bestfit_emcee(ll, quiet=False, return_errors=False, return_samples=False, n_walkers=40, n_steps=200, n_burn_in=100, n_threads=1, **kwargs)[source]

Optimize the loglikelihood function ll using emcee’s MCMC. The starting position of the walkers is [0.95, 1.05] * the default values / any guess you put in. So if your default value is 0 you have to put in a custom guess. (TODO: fix this)

Parameters:
  • ll – LogLikelihood to optimize
  • quiet – if False (default), show corner plot and print out passthrough info
  • return_errors – if True, return a third result, dictionary with 1 sigma errors for each parameter
  • return_samples – if True, return a third result, flattened numpy array of samples visited (except in burn-in)
  • n_walkers – Number of walkers to use for the MCMC
  • n_steps – Number of steps to use for MCMC
  • n_burn_in – Number of burn-in steps to use. These are added to n_steps but thrown away.
  • n_threads – Number of concurrent threads to use
  • kwargs – Passed to ll.make_objective.
Returns:

{param: best fit}, maximum loglikelihood.
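
A minimal usage sketch, assuming lf is a prepared LogLikelihood with data set:

    # MCMC fit; quiet=True suppresses the corner plot and printout.
    best_fit, max_ll = lf.bestfit_emcee(n_walkers=40, n_steps=200, quiet=True)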

blueice.likelihood module

Log likelihood constructors: the heart of blueice.

class blueice.likelihood.LogLikelihoodBase(pdf_base_config, likelihood_config=None, **kwargs)[source]

Bases: object

Log likelihood function with several rate and/or shape parameters

likelihood_config options:
  • unphysical_behaviour
  • outlier_likelihood
  • parallelize_models: True (default) or False
  • block_during_paralellization: True or False (default)
add_rate_parameter(source_name, log_prior=None)[source]

Add a rate parameter named source_name + “_rate_multiplier” to the likelihood function. The values of this parameter will MULTIPLY the expected rate of events for the source. (The rates of sources can also vary due to shape parameters.) :param source_name: Name of the source for which you want to vary the rate. :param log_prior: prior logpdf function on the rate multiplier (not on the rate itself!)

add_rate_uncertainty(source_name, fractional_uncertainty)[source]

Adds a rate parameter to the likelihood function with Gaussian prior

add_shape_parameter(setting_name, anchors, log_prior=None, base_value=None)[source]

Add a shape parameter to the likelihood function.
:param setting_name: Name of the setting to vary.
:param anchors: a list/tuple/array of setting values (if they are numeric), OR a dictionary mapping a numerical value -> setting value (for non-numeric settings).
:param base_value: for non-numeric settings, the number which represents the base model value of the setting.

For example, if you have LCE maps with varying reflectivities, use

add_shape_parameter(‘s1_relative_ly_map’, {0.98: ‘lce_98%.pklz’, 0.99: ‘lce_99%.pklz’, …})

then the argument s1_relative_ly_map of the likelihood function takes values between 0.98 and 0.99.
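
A minimal sketch of building a likelihood with one rate and one numeric shape parameter; the configuration dictionary, setting name, and dataset are hypothetical:

    from blueice.likelihood import UnbinnedLogLikelihood

    lf = UnbinnedLogLikelihood(my_pdf_base_config)   # assumed pre-existing config dict
    lf.add_rate_parameter('signal')                  # adds 'signal_rate_multiplier'
    lf.add_shape_parameter('energy_threshold', anchors=(2., 3., 4.))
    lf.prepare()                                     # compute models at every anchor combination
    lf.set_data(d)                                   # d: indexable dataset with the analysis dimensions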

add_shape_uncertainty(setting_name, fractional_uncertainty, anchor_zs=(-2, -1, 0, 1, 2), base_value=None)[source]

Adds a shape parameter to the likelihood function, with a Gaussian prior around the default value. :param fractional_uncertainty: Relative uncertainty on the default value. Other parameters as in add_shape_parameter.

adjust_expectations(mus, ps, n_model_events)[source]

Adjust uncertain (mus, pmfs) based on the observed data.

If the density is derived from a finite-statistics sample (n_model_events array of events per bin), we can take into account this uncertainty by modifying the likelihood function.

For a binned likelihood, this means adding the expected number of events for each bin for each source as nuisance parameters constrained by Poisson terms around the number of events observed in the model. While these nuisance parameters could be optimized numerically along with the main parameters, for a given value of the main parameters these per-bin nuisance parameters can often be estimated analytically, as shown by Beeston & Barlow (1993).
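
A sketch of the resulting likelihood in the notation of Beeston & Barlow: with d_i observed events in bin i, a_{ij} model events for source j in bin i, per-bin nuisance expectations A_{ij}, and source strengths p_j, the log-likelihood gains Poisson terms

    \ln L = \sum_i \left( d_i \ln f_i - f_i \right)
          + \sum_{i,j} \left( a_{ij} \ln A_{ij} - A_{ij} \right),
    \qquad f_i = \sum_j p_j A_{ij}

(up to constants). For fixed p_j, maximizing over the A_{ij} reduces to solving a single equation per bin, which is what makes the analytic estimate practical.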

best_anchor(lf)

Return shape parameter dictionary of anchor model with highest likelihood. Useful as a guess for further fitting.

bestfit_emcee(ll, quiet=False, return_errors=False, return_samples=False, n_walkers=40, n_steps=200, n_burn_in=100, n_threads=1, **kwargs)

Optimize the loglikelihood function ll using emcee’s MCMC. The starting position of the walkers is [0.95, 1.05] * the default values / any guess you put in. So if your default value is 0 you have to put in a custom guess. (TODO: fix this)

Parameters:
  • ll – LogLikelihood to optimize
  • quiet – if False (default), show corner plot and print out passthrough info
  • return_errors – if True, return a third result, dictionary with 1 sigma errors for each parameter
  • return_samples – if True, return a third result, flattened numpy array of samples visited (except in burn-in)
  • n_walkers – Number of walkers to use for the MCMC
  • n_steps – Number of steps to use for MCMC
  • n_burn_in – Number of burn-in steps to use. These are added to n_steps but thrown away.
  • n_threads – Number of concurrent threads to use
  • kwargs – Passed to ll.make_objective.
Returns:

{param: best fit}, maximum loglikelihood.

bestfit_minuit(lf, minimize_kwargs=None, rates_in_log_space=False, **kwargs)

Minimizes the LogLikelihood function lf over the parameters not specified in kwargs. Returns {param: best fit}, maximum loglikelihood.

Optimization is performed with iminuit’s Minuit. :param minimize_kwargs: dictionary of options passed to the minimizer.

Other kwargs are passed to make_objective.

bestfit_scipy(lf, minimize_kwargs=None, rates_in_log_space=False, pass_bounds_to_minimizer=False, **kwargs)

Minimizes the LogLikelihood function lf over the parameters not specified in kwargs. Returns {param: best fit}, maximum loglikelihood.

Optimization is performed with the scipy minimizer. :param minimize_kwargs: dictionary of options passed to the minimizer. :param pass_bounds_to_minimizer: if True (default False), pass bounds to the minimizer via its bounds argument. This shouldn’t be necessary, as the likelihood function returns -inf outside the bounds. I’ve gotten strange results with L-BFGS-B, scipy’s default method for bound problems; perhaps it is less well tested? If you pass this, I recommend also passing a different minimizer method (e.g. TNC or SLSQP).

Other kwargs are passed to make_objective.

get_bounds(parameter_name=None)[source]

Return bounds on the parameter parameter_name

make_objective(lf, guess=None, minus=True, rates_in_log_space=False, **kwargs)

Return convenient stuff for feeding the LogLikelihood lf to an optimizer.

Parameters:
  • guess – dictionary with guesses for the remaining (“floating”) parameters. If you don’t supply a guess for a parameter, the base model setting will be taken as the guess.
  • minus – if True (default), multiply the result of LogLikelihood by -1. Minimizers tend to appreciate this; samplers like MCMC do not.
  • rates_in_log_space – UNTESTED: let the minimizer work on the rate multipliers in log space instead
  • **kwargs – fixed values for certain parameters. These will not be fitted.
Returns f, guesses, names, bounds:
  • f: function which takes a single arraylike argument containing only the floating parameters
  • guesses: array of guesses in the order taken by f
  • names: list of floating parameter names in the correct order
  • bounds: list of tuples of bounds for the floating parameters; (None, None) if a parameter has no bounds.
one_parameter_interval(lf, target, bound, confidence_level=0.9, kind='upper', bestfit_routine=None, t_ppf=None, **kwargs)

Set a confidence_level interval of kind (central, upper, lower) on the parameter target of lf. This assumes the likelihood ratio is asymptotically chi2(1)-distributed (Wilks’ theorem).
:param target: parameter of lf to constrain.
:param bound: bound(s) for the line search. For upper and lower: a single value; for central: a 2-tuple.
:param t_ppf: function (hypothesis, level) -> test statistic (-2 Log[ L(test)/L(bestfit) ]); must return the value at which the test statistic reaches the level’th quantile if the hypothesis is true. If not specified, Wilks’ theorem will be used.
:param kwargs: dictionary with arguments to bestfit.

plot_likelihood_ratio(lf, *space, vmax=15, bestfit_routine=None, plot_kwargs=None, **kwargs)

Plots the negative log-likelihood ratio derived from the LogLikelihood lf in a parameter space. :param lf: LogLikelihood function with data set. :param space: list/tuple of tuples (dimname, points to plot). :param vmax: Limit for the color bar (2d) or the y axis (1d). :param plot_kwargs: kwargs passed to plt.plot / plt.pcolormesh. Further arguments are passed to lf; arguments not passed are fitted at each point. :return: Nothing

prepare(n_cores=1, ipp_client=None)[source]

Prepares a likelihood function with shape parameters for use. This will compute the models for each shape parameter anchor value combination.

set_data(d)[source]

Prepare the dataset d for likelihood function evaluation. :param d: Dataset; must be an indexable object that provides the measurement dimensions. For example, if your models are on ‘s1’ and ‘s2’, d must be something for which d[‘s1’] and d[‘s2’] give the s1 and s2 values of your events as numpy arrays.
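
A minimal sketch of a dataset satisfying this interface, using the ‘s1’/‘s2’ dimensions from the example above and assuming lf is a prepared likelihood; a pandas DataFrame with those columns would also work:

    import numpy as np

    d = np.zeros(100, dtype=[('s1', float), ('s2', float)])  # record array, indexable by name
    d['s1'] = np.random.exponential(10, size=100)
    d['s2'] = np.random.exponential(100, size=100)
    lf.set_data(d)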

class blueice.likelihood.BinnedLogLikelihood(pdf_base_config, likelihood_config=None, **kwargs)[source]

Bases: blueice.likelihood.LogLikelihoodBase

adjust_expectations(mus, pmfs, n_model_events)[source]

Adjust uncertain (mus, pmfs) based on the observed data.

If the density is derived from a finite-statistics sample (n_model_events array of events per bin), we can take into account this uncertainty by modifying the likelihood function.

For a binned likelihood, this means adding the expected number of events for each bin for each source as nuisance parameters constrained by Poisson terms around the number of events observed in the model. While these nuisance parameters could be optimized numerically along with the main parameters, for a given value of the main parameters these per-bin nuisance parameters can often be estimated analytically, as shown by Beeston & Barlow (1993).

prepare(*args)[source]

Prepares a likelihood function with shape parameters for use. This will compute the models for each shape parameter anchor value combination.

set_data(d)[source]

Prepare the dataset d for likelihood function evaluation. :param d: Dataset; must be an indexable object that provides the measurement dimensions. For example, if your models are on ‘s1’ and ‘s2’, d must be something for which d[‘s1’] and d[‘s2’] give the s1 and s2 values of your events as numpy arrays.

class blueice.likelihood.UnbinnedLogLikelihood(pdf_base_config, likelihood_config=None, **kwargs)[source]

Bases: blueice.likelihood.LogLikelihoodBase

set_data(d)[source]

Prepare the dataset d for likelihood function evaluation. :param d: Dataset; must be an indexable object that provides the measurement dimensions. For example, if your models are on ‘s1’ and ‘s2’, d must be something for which d[‘s1’] and d[‘s2’] give the s1 and s2 values of your events as numpy arrays.

class blueice.likelihood.LogLikelihoodSum(likelihood_list)[source]

Bases: object

Class that takes a list of likelihoods to be minimized together, and provides an interface to the inference methods and evaluation similar to individual likelihoods. Note that the pdf_base_config is a bit of a fudge: only the guesses from the last likelihood are stored. As different guesses for different likelihoods should be a cause for concern, the safest method is to pass manual guesses to the minimization.
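
A minimal sketch of combining two prepared likelihoods, assuming lf1 and lf2 are LogLikelihoods that share parameter names where parameters should be fitted jointly:

    from blueice.likelihood import LogLikelihoodSum

    lf_total = LogLikelihoodSum([lf1, lf2])
    best_fit, max_ll = lf_total.bestfit_scipy()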

best_anchor(lf)

Return shape parameter dictionary of anchor model with highest likelihood. Useful as a guess for further fitting.

bestfit_emcee(ll, quiet=False, return_errors=False, return_samples=False, n_walkers=40, n_steps=200, n_burn_in=100, n_threads=1, **kwargs)

Optimize the loglikelihood function ll using emcee’s MCMC. The starting position of the walkers is [0.95, 1.05] * the default values / any guess you put in. So if your default value is 0 you have to put in a custom guess. (TODO: fix this)

Parameters:
  • ll – LogLikelihood to optimize
  • quiet – if False (default), show corner plot and print out passthrough info
  • return_errors – if True, return a third result, dictionary with 1 sigma errors for each parameter
  • return_samples – if True, return a third result, flattened numpy array of samples visited (except in burn-in)
  • n_walkers – Number of walkers to use for the MCMC
  • n_steps – Number of steps to use for MCMC
  • n_burn_in – Number of burn-in steps to use. These are added to n_steps but thrown away.
  • n_threads – Number of concurrent threads to use
  • kwargs – Passed to ll.make_objective.
Returns:

{param: best fit}, maximum loglikelihood.

bestfit_minuit(lf, minimize_kwargs=None, rates_in_log_space=False, **kwargs)

Minimizes the LogLikelihood function lf over the parameters not specified in kwargs. Returns {param: best fit}, maximum loglikelihood.

Optimization is performed with iminuit’s Minuit. :param minimize_kwargs: dictionary of options passed to the minimizer.

Other kwargs are passed to make_objective.

bestfit_scipy(lf, minimize_kwargs=None, rates_in_log_space=False, pass_bounds_to_minimizer=False, **kwargs)

Minimizes the LogLikelihood function lf over the parameters not specified in kwargs. Returns {param: best fit}, maximum loglikelihood.

Optimization is performed with the scipy minimizer. :param minimize_kwargs: dictionary of options passed to the minimizer. :param pass_bounds_to_minimizer: if True (default False), pass bounds to the minimizer via its bounds argument. This shouldn’t be necessary, as the likelihood function returns -inf outside the bounds. I’ve gotten strange results with L-BFGS-B, scipy’s default method for bound problems; perhaps it is less well tested? If you pass this, I recommend also passing a different minimizer method (e.g. TNC or SLSQP).

Other kwargs are passed to make_objective.

get_bounds(parameter_name=None)[source]

Return bounds on the parameter parameter_name

make_objective(lf, guess=None, minus=True, rates_in_log_space=False, **kwargs)

Return convenient stuff for feeding the LogLikelihood lf to an optimizer.

Parameters:
  • guess – dictionary with guesses for the remaining (“floating”) parameters. If you don’t supply a guess for a parameter, the base model setting will be taken as the guess.
  • minus – if True (default), multiply the result of LogLikelihood by -1. Minimizers tend to appreciate this; samplers like MCMC do not.
  • rates_in_log_space – UNTESTED: let the minimizer work on the rate multipliers in log space instead
  • **kwargs – fixed values for certain parameters. These will not be fitted.
Returns f, guesses, names, bounds:
  • f: function which takes a single arraylike argument containing only the floating parameters
  • guesses: array of guesses in the order taken by f
  • names: list of floating parameter names in the correct order
  • bounds: list of tuples of bounds for the floating parameters; (None, None) if a parameter has no bounds.
one_parameter_interval(lf, target, bound, confidence_level=0.9, kind='upper', bestfit_routine=None, t_ppf=None, **kwargs)

Set a confidence_level interval of kind (central, upper, lower) on the parameter target of lf. This assumes the likelihood ratio is asymptotically chi2(1)-distributed (Wilks’ theorem).
:param target: parameter of lf to constrain.
:param bound: bound(s) for the line search. For upper and lower: a single value; for central: a 2-tuple.
:param t_ppf: function (hypothesis, level) -> test statistic (-2 Log[ L(test)/L(bestfit) ]); must return the value at which the test statistic reaches the level’th quantile if the hypothesis is true. If not specified, Wilks’ theorem will be used.
:param kwargs: dictionary with arguments to bestfit.

plot_likelihood_ratio(lf, *space, vmax=15, bestfit_routine=None, plot_kwargs=None, **kwargs)

Plots the negative log-likelihood ratio derived from the LogLikelihood lf in a parameter space. :param lf: LogLikelihood function with data set. :param space: list/tuple of tuples (dimname, points to plot). :param vmax: Limit for the color bar (2d) or the y axis (1d). :param plot_kwargs: kwargs passed to plt.plot / plt.pcolormesh. Further arguments are passed to lf; arguments not passed are fitted at each point. :return: Nothing

split_results(result_dict)[source]

blueice.model module

class blueice.model.Model(config, **kwargs)[source]

Bases: object

Model for dataset simulation and analysis: collects several Sources, which do the actual work

expected_events(s=None)[source]

Return the total number of events expected in the analysis range for the source s. If no source is specified, return an array of results for all sources. # TODO: Why is this not a method of source?

get_source(source_id)[source]
get_source_i(source_id)[source]
pmf_grids()[source]

Return array (n_sources, *analysis_space_shape) of integrated PDFs in the analysis space for each source

range_cut(d)[source]

Return events from dataset d which are inside the bounds of the analysis space

score_events(d)[source]

Returns array (n_sources, n_events) of pdf values for each source for each of the events

show(d, ax=None, dims=None, **kwargs)[source]

Plot the events from dataset d in the analysis range. :param ax: plot on this Axes. :param dims: numbers of the dimension(s) to plot in. Can be up to two dimensions.

simulate(rate_multipliers=None, livetime_days=None)[source]

Makes a toy dataset, poisson sampling simulated events from all sources. :param rate_multipliers: dict {source name: multiplier} to change rate of individual sources :param livetime_days: days of exposure to simulate (affects rate of all sources)
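
A minimal usage sketch, assuming m is a constructed Model and the source name is hypothetical:

    # Toy dataset with the signal rate doubled and 100 days of exposure.
    d = m.simulate(rate_multipliers={'signal': 2.0}, livetime_days=100)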

to_analysis_dimensions(d)[source]

Given a dataset, returns list of arrays of coordinates of the events in the analysis dimensions

blueice.parallel module

Parallel and delayed computation of models/sources for blueice.

blueice.parallel.create_models_ipyparallel(configs, ipp_client=None, block=False)[source]

Return Models for each configuration in configs.

Parameters:
  • configs – list of Model configuration dictionaries
  • ipp_client – ipyparallel client to use for parallelized computation, or None (in which case models will be computed serially). For now only engines running in the same directory as the main code are supported, see #1.
  • block – passed to the async map of ipyparallel. Useful for debugging, but disables the progress bar.
Returns:

list of Models.

blueice.parallel.compute_single(hash, task_dir='pdf_tasks', result_dir='pdf_cache')[source]

Computes a single source PDF from a saved task file

blueice.parallel.compute_many(hashes, n_cpus=1, *args, **kwargs)[source]
blueice.parallel.compute_all(input_dir='./pdf_cache', *args, **kwargs)[source]

blueice.pdf_morphers module

Morphers: interpolate multidimensional functions of models

class blueice.pdf_morphers.Morpher(config, shape_parameters)[source]

Bases: object

get_anchor_points(bounds, n_models=None)[source]

Returns list of anchor z-coordinates at which we should sample n_models between bounds. The morpher may choose to ignore your bounds and n_models argument if it doesn’t support them.

make_interpolator(f, extra_dims, anchor_models)[source]

Return a function which interpolates the extra_dims-valued function f(model) between the anchor points. :param f: Function which takes a Model as argument, and produces an extra_dims shaped array. :param extra_dims: tuple of integers, shape of return value of f. :param anchor_models: dictionary {z-score: Model} of anchor models at which to evaluate f.

class blueice.pdf_morphers.GridInterpolator(config, shape_parameters)[source]

Bases: blueice.pdf_morphers.Morpher

get_anchor_points(bounds, n_models=None)[source]

Returns list of anchor z-coordinates at which we should sample n_models between bounds. The morpher may choose to ignore your bounds and n_models argument if it doesn’t support them.

make_interpolator(f, extra_dims, anchor_models)[source]

Return a function which interpolates the extra_dims-valued function f(model) between the anchor points. :param f: Function which takes a Model as argument, and produces an extra_dims shaped array. :param extra_dims: tuple of integers, shape of return value of f. :param anchor_models: dictionary {z-score: Model} of anchor models at which to evaluate f.

class blueice.pdf_morphers.RadialInterpolator(config, shape_parameters)[source]

Bases: blueice.pdf_morphers.Morpher

This morpher is highly experimental!!

get_anchor_points(bounds, n_models=10)[source]

Returns list of anchor z-coordinates at which we should sample n_models between bounds. The morpher may choose to ignore your bounds and n_models argument if it doesn’t support them.

make_interpolator(f, extra_dims, anchor_models)[source]

Return a function which interpolates the extra_dims-valued function f(model) between the anchor points. :param f: Function which takes a Model as argument, and produces an extra_dims shaped array. :param extra_dims: tuple of integers, shape of return value of f. :param anchor_models: dictionary {z-score: Model} of anchor models at which to evaluate f.

blueice.pdf_morphers.latin(n, d, box=None, shuffle_steps=500)[source]

Creates a latin hypercube of n points in d dimensions. Stolen from https://github.com/paulknysh/blackbox

blueice.source module

Built-in Source baseclasses. In order of increasing functionality and decreasing generality:

  • Source: only sets up default arguments and helper functions for caching. Use e.g. if you have an analytic pdf.
  • HistogramPdfSource: + fetch/interpolate the PDF/PMF from a (multihist) histogram. Use e.g. if you have a numerically computable pdf (e.g. using convolution of some known functions).
  • DensityEstimatingSource: + create that histogram by binning some sample of events. Use e.g. if you want to estimate the density from a calibration data sample.
  • MonteCarloSource: + get that sample from the source’s own simulate method. Use if you have a Monte Carlo to generate events. This was the original ‘niche’ for which blueice was created.

Parent methods (e.g. Source.compute_pdf) are meant to be called at the end of the child methods that override them (e.g. HistogramPdfSource.compute_pdf).
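
A minimal sketch of a custom source; the class name, the ‘x’ analysis dimension, and the ‘mu’/‘sigma’ config keys are hypothetical, and a real pdf_base_config needs more keys (analysis space, rates, and so on):

    import numpy as np
    from blueice.source import MonteCarloSource

    class MyGaussianSource(MonteCarloSource):
        # The PDF is estimated from events produced by our own simulate().
        def simulate(self, n_events):
            # Return a record array with the analysis dimensions as fields.
            d = np.zeros(n_events, dtype=[('x', float)])
            d['x'] = np.random.normal(self.config['mu'], self.config['sigma'], n_events)
            return d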

class blueice.source.Source(config, *args, **kwargs)[source]

Bases: object

Base class for a source of events.

compute_pdf()[source]

Initialize, then cache the PDF. This is called:
  • AFTER the config initialization, and
  • ONLY when the source is not already loaded from cache.
The caching mechanism exists to store the quantities you need to compute here.
get_pmf_grid(*args)[source]
Returns pmf_grid, n_events:
  • pmf_grid: pmf per bin in the analysis space
  • n_events: if events were used for density estimation: number of events per bin (for DensityEstimatingSource) otherwise float(‘inf’)

This is used by binned likelihoods. If you have an unbinned density estimator, you’ll have to write some integration / sampling routine!

pdf(*args)[source]
prepare_task()[source]

Create a task file in the task_dir for delayed/remote computation

save_to_cache()[source]

Save attributes in self.config[‘cache_attributes’] of this source to cache.

simulate(n_events)[source]

Simulate n_events according to the source. It’s ok to return fewer than n_events events, if you decide some events are not detectable.

class blueice.source.HistogramPdfSource(config, *args, **kwargs)[source]

Bases: blueice.source.Source

A source which takes its PDF values from a multihist histogram.

build_histogram()[source]

Set the _pdf_histogram (Histdd), _n_events_histogram (Histdd) and _bin_volumes (numpy array) attributes

compute_pdf()[source]
get_pmf_grid()[source]
pdf(*args)[source]
simulate(n_events)[source]

Simulate n_events from the PDF histogram

class blueice.source.DensityEstimatingSource(config, *args, **kwargs)[source]

Bases: blueice.source.HistogramPdfSource

A source which estimates its PDF by some events you give to it. Child classes need to implement get_events_for_density_estimate, and call compute_pdf when they are ready (usually at the end of their own init).

build_histogram()[source]
get_events_for_density_estimate()[source]

Return, or yield in batches, (events for use in density estimation, number of events simulated/read). Passing the count is necessary because you sometimes work with simulators that already cut some events.

class blueice.source.MonteCarloSource(config, *args, **kwargs)[source]

Bases: blueice.source.DensityEstimatingSource

A DensityEstimatingSource which gets the events for the density estimator from its own simulate() method. Child classes have to implement simulate.

get_events_for_density_estimate()[source]

blueice.test_helpers module

Common code for tests. The tests themselves are located in ../tests, but need to import this, so…

class blueice.test_helpers.FixedSampleSource(*args, **kwargs)[source]

Bases: blueice.source.DensityEstimatingSource

get_events_for_density_estimate()[source]
class blueice.test_helpers.GaussianMCSource(config, *args, **kwargs)[source]

Bases: blueice.test_helpers.GaussianSourceBase, blueice.source.MonteCarloSource

Analog of GaussianSource which generates its PDF from MC

class blueice.test_helpers.GaussianSource(*args, **kwargs)[source]

Bases: blueice.test_helpers.GaussianSourceBase

A 1d source with a Gaussian PDF – useful for testing. If your sources are as simple as this, you probably don’t need blueice!

compute_pdf()[source]
pdf(*args)[source]
class blueice.test_helpers.GaussianSourceBase(config, *args, **kwargs)[source]

Bases: blueice.source.Source

Analog of GaussianSource which generates its events directly from the PDF

simulate(n_events)[source]
blueice.test_helpers.almost_equal(a, b, fraction=1e-06)[source]
blueice.test_helpers.make_data(instructions)[source]
make_data([dict(n_events=24, x=0.5),
           dict(n_events=56, x=1.5)])

produces 24 events with x=0.5 and 56 events with x=1.5. :return: numpy record array accepted by set_data

blueice.test_helpers.test_conf(n_sources=1, mc=False, **kwargs)[source]

blueice.utils module

blueice.utils.inherit_docstring_from(cls)[source]

Decorator for inheriting doc strings, stolen from https://groups.google.com/forum/#!msg/comp.lang.python/HkB1uhDcvdk/lWzWtPy09yYJ

blueice.utils.combine_dicts(*args, exclude=(), deep_copy=False)[source]

Returns a new dict with entries from all dicts passed, with later dicts overriding earlier ones. :param exclude: Remove these keys from the result. :param deep_copy: Perform a deepcopy of the dicts before combining them.
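
A minimal usage sketch:

    from blueice.utils import combine_dicts

    base = dict(a=1, b=2)
    override = dict(b=3, secret=4)
    combine_dicts(base, override, exclude=('secret',))   # -> {'a': 1, 'b': 3}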

blueice.utils.data_file_name(filename, data_dirs=None)[source]

Returns filename if a file exists. Also checks data_dirs for the file.

blueice.utils.find_file_in_folders(filename, folders)[source]

Searches for filename in folders, then returns the full path or raises FileNotFoundError. Does not recurse into subdirectories.

blueice.utils.read_pickle(filename)[source]
blueice.utils.save_pickle(stuff, filename)[source]

Saves stuff in a pickle at filename

blueice.utils.hashablize(obj)[source]

Convert a container hierarchy into one that can be hashed. See http://stackoverflow.com/questions/985294

blueice.utils.deterministic_hash(thing)[source]

Return a deterministic hash of a container hierarchy using hashablize, pickle and sha1
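
A minimal usage sketch; unlike Python’s built-in hash, the result is stable across processes, which is what makes it usable for the PDF cache (the container contents here are hypothetical):

    from blueice.utils import deterministic_hash

    # Equal container hierarchies always give the same hash.
    h = deterministic_hash(dict(source='er', anchors=[0.98, 0.99]))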

class blueice.utils.InterpolateAndExtrapolate1D(points, values)[source]

Bases: object

Extends scipy.interpolate.interp1d to do constant extrapolation outside of the data range

blueice.utils.arrays_to_grid(arrs)[source]

Convert a list of n 1-dim arrays to an (n+1)-dim array, where the last dimension denotes the coordinate values at each point.

Module contents