Adds updates to Nov 19 presentation.

Caleb Fangmeier 5 years ago
parent
commit
71f211680f

BIN
docs/presentations/2018_10_24/figures/gsf_track_kinem2d_dy_noskip.png


BIN
docs/presentations/2018_10_24/figures/gsf_track_kinem2d_dy_skip-pileup.png


BIN
docs/presentations/2018_10_24/figures/gsf_track_kinem2d_dy_skip.png


BIN
docs/presentations/2018_10_24/figures/gsf_track_kinem_dy_noskip.png


BIN
docs/presentations/2018_10_24/figures/gsf_track_kinem_dy_skip-pileup.png


BIN
docs/presentations/2018_10_24/figures/gsf_track_kinem_dy_skip.png


BIN
docs/presentations/2018_10_24/figures/number_of_good_seeds_dy_skip-pileup.png


BIN
docs/presentations/2018_10_24/figures/number_of_good_seeds_dy_skip.png


BIN
docs/presentations/2018_10_24/figures/seeding_performance.png


BIN
docs/presentations/2018_10_24/main.pdf


+ 94 - 16
docs/presentations/2018_10_24/main.tex

@@ -1,6 +1,3 @@
-
-% rubber: module pdftex
-
 \documentclass[english,aspectratio=43,8pt]{beamer}
 \usepackage{graphicx}
 \usepackage{amssymb}
@@ -19,6 +16,17 @@
 \hypersetup{colorlinks=true,urlcolor=blue}
 \usetheme[]{bjeldbak}
 
+
+%%% For wider frames
+\newcommand\Wider[2][3em]{%
+\makebox[\linewidth][c]{%
+  \begin{minipage}{\dimexpr\textwidth+#1\relax}
+  \raggedright#2
+  \end{minipage}%
+  }%
+}
+%%% For wider frames end
+
 \newcommand{\backupbegin}{%
    \newcounter{finalframe}
    \setcounter{finalframe}{\value{framenumber}}
@@ -39,7 +47,7 @@
 \title[$e$ Seeding Validation]{Offline Electron Seeding Validation \-- Update}
 \author[C. Fangmeier]{\textbf{Caleb Fangmeier} \\ Ilya Kravchenko,  Greg Snow}
 \institute[UNL]{University of Nebraska \-- Lincoln}
-\date{EGM Reco/Comm/HLT meeting | June 22, 2018}
+\date{EGM General Meeting | November 19, 2018}
 
 \titlegraphic{%
 \begin{figure}
@@ -55,12 +63,14 @@
   \begin{itemize}
     \item Our goal is to study \textbf{seeding} for the \textbf{offline} GSF tracking with the \textbf{Phase I pixel detector}.
     \item Specifically, we want to optimize the new pixel-matching scheme from HLT for use in off-line reconstruction.
+    \item The previous presentation\footnotemark{} showed efficiency, purity, and fake rate for the proposed offline electron seeding working points.
     \item This Talk:
       \begin{itemize}
         \item Explain ``Hit Skipping'' and demonstrate effects on seeding performance
-        \item Compare performance with pileup added
+        \item Examine the effects of adding pileup on seeding performance
       \end{itemize}
   \end{itemize}
+  \footnotetext[1]{\tiny \url{https://indico.cern.ch/event/697084/#2-update-on-offline-electron-s}}
 \end{frame}
 
 
@@ -122,7 +132,7 @@
       \begin{itemize}
         \item When \texttt{NHitElectronSeedProducer} was implemented for HLT, hit skipping was not added.
         \item Consider an example configuration where we are generating first quadruplet, then triplet, and then finally doublet seeds, masking hits along the way.
-        \item If we require at least 3 matched hits, the old method \emph{with} hit skipping would create a seed of hits \texttt{BPIX1, BPIX2, BPIX3}.
+        \item If we require at least 3 matched hits, the old method \emph{with} hit skipping would create a seed of hits \texttt{BPIX1, BPIX2, BPIX4}.
         \item But the new method \emph{without} hit skipping would not make any seed from these hits.
         \item The ``hack'' is to create seeds using only the steps \texttt{tripletElectronSeeds} and \texttt{pixelPairElectronSeeds} with \textbf{no masking}.
         \item Adding skipping and removing the hack would reduce the CPU time spent on redundant seeds.
@@ -143,10 +153,10 @@
   \begin{columns}[t]
     \begin{column}{0.5\textwidth}
       \begin{itemize}
-        \item Enabling hit skipping and removing hack reduces number of seeds by 35\% to 50\%.
+        \item Enabling hit skipping and removing the hack reduces the number of seeds by 36\% to 51\%.
         \item 3-5x fewer seeds with respect to old seeding
         \item Efficiency reduced by between 4\% and 6\% to align more with old seeding performance.
-        \item Purity improved by between about 1\%.
+        \item Purity improved by about 1\%.
         \item (table in backup)
       \end{itemize}
     \end{column}
@@ -177,9 +187,8 @@ Drell-Yan  & Old - default settings  & -                      & 11.40
 
 \begin{frame}{Adding Pileup}
       \begin{itemize}
-        % \item All previous results are \emph{without} pileup.
         \item The simhit-rechit linkage that was previously used in efficiency/purity measurements is not saved in \texttt{GEN-SIM-RAW}.
-        \item Therefore, the \texttt{DIGI} step was re-run, but only for the signal event.
+        \item Therefore, the \texttt{DIGI} step was re-run, but \emph{only for the signal event} because \texttt{GEN-SIM-RAW} does not contain \texttt{SIM} information for pileup events.
         \item However, running this instead of the \texttt{RAW2DIGI} step discarded the previously mixed pileup in the \texttt{RAW}.
         \item So even though there is a \texttt{PileupInfo} collection with reasonable values, there are no actual pileup hits being used for tracking (this caused me quite some confusion).
         \item In the end, abandon the simhit-rechit linkage and just use $\Delta R$ matching for efficiency/purity.
@@ -193,16 +202,77 @@ Drell-Yan  & Old - default settings  & -                      & 11.40
   \end{columns}
 \end{frame}
 
-% ask for conclusion to project and find out
-% - what changes need to be made
-% - who is going to implement them
+\begin{frame}{Adding Pileup - Issues}
+      \begin{itemize}
+        % \item All previous results are \emph{without} pileup.
+        \item Creating kinematic distributions raises some apparent issues with how the new seeding handles pileup.
+        \item The next three slides show $p_T$/$\eta$/$\phi$ distributions of GSF tracks resulting from ECAL-driven seeds.
+      \end{itemize}
+      % \begin{figure}
+      %   \includegraphics[width=0.32\textwidth]{figures/gsf_track_kinem_dy_skip.png}
+      %   \includegraphics[width=0.32\textwidth]{figures/gsf_track_kinem_dy_skip-pileup.png}
+      % \end{figure}
+\end{frame}
+
+\begin{frame}{Adding Pileup - Issues - No Skipping, No Pileup}
+\Wider{
+      \begin{figure}
+        \includegraphics[width=0.49\textwidth]{figures/gsf_track_kinem_dy_noskip.png}
+        \includegraphics[width=0.49\textwidth]{figures/gsf_track_kinem2d_dy_noskip.png}
+      \end{figure}
+      \begin{itemize}
+        \item Looks basically OK; use this as a baseline for comparison.
+      \end{itemize}
+}
+\end{frame}
+
+\begin{frame}{Adding Pileup - Issues - With Skipping, No Pileup}
+\Wider{
+      \begin{figure}
+        \includegraphics[width=0.49\textwidth]{figures/gsf_track_kinem_dy_skip.png}
+        \includegraphics[width=0.49\textwidth]{figures/gsf_track_kinem2d_dy_skip.png}
+      \end{figure}
+      \begin{itemize}
+        \item Concerning dip around $\phi=3$ coming from $\eta \in (0.5, 1.5)$.
+      \end{itemize}
+}
+\end{frame}
+
+\begin{frame}{Adding Pileup - Issues - With Skipping, With Pileup}
+\Wider{
+      \begin{figure}
+        \includegraphics[width=0.49\textwidth]{figures/gsf_track_kinem_dy_skip-pileup.png}
+        \includegraphics[width=0.49\textwidth]{figures/gsf_track_kinem2d_dy_skip-pileup.png}
+      \end{figure}
+      \begin{itemize}
+        \item Strangely non-flat $\phi$ distribution; it manifests differently in old and new seeding, but is more pronounced in the new.
+        \item Features seem to be somewhat localized in $\phi$--$\eta$, possibly some kind of detector effect?
+      \end{itemize}
+}
+\end{frame}
+
+
+\begin{frame}{Adding Pileup - More Issues}
+\Wider{
+  \begin{figure}
+    \includegraphics[width=0.49\textwidth]{figures/number_of_good_seeds_dy_skip.png}
+    \includegraphics[width=0.49\textwidth]{figures/number_of_good_seeds_dy_skip-pileup.png}
+  \end{figure}
+  \begin{itemize}
+    \item The relative reduction in the number of seeds is gone. In fact, with pileup the number of seeds increases by a factor of 2.3 to 5 relative to the old seeding method.
+  \end{itemize}
+}
+\end{frame}
 
 \begin{frame}{Conclusions \& Outlook}
   \begin{itemize}
-    \item TODO
+    \item Hit skipping has been reintroduced in the new seeding and gives the expected results.
+    \item Analyzing MC with pileup mixed in highlights potential issues with using the new seeding for offline reconstruction.
+    \item Further investigation is necessary to determine the source of the issues.
+    \item Any ideas for checks or fixes are welcome!
   \end{itemize}
   \blfootnote{\tiny Analysis and plotting code is available at \url{https://git.fangmeier.tech/caleb/EGamma\_ElectronTrackingValidation}}
-  % \blfootnote{\tiny Additional plots are available at \url{https://eg.fangmeier.tech/seeding\_studies\_2018\_06\_20\_17/hists.html}}
+  \blfootnote{\tiny Additional plots are available at \url{http://t3.unl.edu/~cfangmeier/eg/seeding\_studies\_2018\_11\_18\_19/hists.html}}
 \end{frame}
 
 \appendix
@@ -228,6 +298,12 @@ Drell-Yan  & Old - default settings  & -                      & 11.40
   \end{itemize}
 \end{frame}
 
+\begin{frame}{Seeding Performance}
+  \begin{figure}
+    \includegraphics[height=0.945\textheight]{figures/seeding_performance.png}
+  \end{figure}
+\end{frame}
+
 \begin{frame}{Matching Window Parameters}
 \begin{table}[]
 \centering
@@ -310,7 +386,7 @@ cmsDriver.py Step2ToTrackingNtuple \
 \begin{verbatim}
 cmsDriver.py RAW2TrackingNtuple \
     --mc \
-    --conditions 92X_upgrade2017_realistic_v7 \
+    --conditions 92X_upgrade2017_realistic_v10 \
     --era Run2_2017  \
     --eventcontent FEVTDEBUG \
     --datatier GEN-SIM-RECO \
@@ -321,7 +397,9 @@ cmsDriver.py RAW2TrackingNtuple \
     --fileout file:trackingNtuple.root \
     --runUnscheduled
 \end{verbatim}
+Additionally, hacks were applied to remove hit-truth dependencies from \texttt{TrackingNtuple}.
 }
+
   \end{column}
   \end{columns}
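
The hit-skipping example on the "Hit Skipping" slide (matched hits on BPIX1, BPIX2 and BPIX4, none on BPIX3, at least 3 hits required) can be sketched as a toy model. This is only an illustration under stated assumptions: `build_seed`, `LAYERS` and the stop-at-first-gap rule are hypothetical and are not taken from `NHitElectronSeedProducer`.

```python
# Toy model only: layer names follow the slide, but the matching logic below
# is an assumption for illustration, not the CMSSW implementation.
LAYERS = ["BPIX1", "BPIX2", "BPIX3", "BPIX4"]


def build_seed(matched_layers, min_hits=3, allow_skip=False):
    """Return the layers forming a seed, or None if too few hits match.

    matched_layers: set of layer names with a hit inside the matching window.
    allow_skip: if True, a layer without a matched hit is skipped and the
        search continues outward; if False, the search stops at the gap.
    """
    seed = []
    for layer in LAYERS:
        if layer in matched_layers:
            seed.append(layer)
        elif seed and not allow_skip:
            break  # a gap after the first matched hit ends the seed
    return seed if len(seed) >= min_hits else None


matched = {"BPIX1", "BPIX2", "BPIX4"}  # no matched hit on BPIX3
with_skip = build_seed(matched, allow_skip=True)
without_skip = build_seed(matched, allow_skip=False)
print(with_skip)     # ['BPIX1', 'BPIX2', 'BPIX4']
print(without_skip)  # None
```

In this toy picture the two-hit remainder (BPIX1, BPIX2) fails the 3-hit requirement when skipping is off, which is the situation the slide describes.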
 

+ 170 - 150
plotting/eff_plots.py

@@ -301,16 +301,17 @@ def plot_roc_curve(pfx, ext=''):
     for i, proc in enumerate(procs):
         plt.subplot(len(procs), 1, i+1)
         row_labels.append(procs[proc])
-        row_labels.extend(['']*(len(wps)-1))
+        row_labels.extend(['']*(len(wps)*len(configs)-2))
         for wp, config in product(wps, configs):
             sample = samples[(proc, wp, config)]
             sample_name = f'{proc}-{wp}-{config}'
+            if wp == 'old-default' and config == 'noskip': continue
             eff, eff_err = get_num_den(sample, f'{pfx}_eff_v_phi{ext}')
             pur, pur_err = get_num_den(sample, f'{pfx}_pur_v_phi{ext}')
             if show_fr:
                 fr, fr_err = get_num_den(sample, f'fake_rate_no_e_match_v_phi')
 
-                rows.append([wp,
+                rows.append([wp, config,
                              rf'${eff*100:0.2f}\pm{eff_err*100:0.2f}\%$',
                              rf'${pur*100:0.2f}\pm{pur_err*100:0.2f}\%$',
                              rf'${fr*100:0.2f}\pm{fr_err*100:0.2f}\%$'])
@@ -327,26 +328,26 @@ def plot_roc_curve(pfx, ext=''):
 
         # center_text(0.3, 0.3, r'$p_T>20$ and $|\eta|<2.5$')
         # plt.axis('equal')
-        # plt.xlim((0.5, 1.02))
-        # plt.ylim((0.5, 1.02))
-        plt.xlim((0, 1.02))
-        plt.ylim((0.7, 1.02))
+        plt.xlim((0.5, 1.02))
+        plt.ylim((0.5, 1.02))
+        # plt.xlim((0, 1.02))
+        # plt.ylim((0.7, 1.02))
         plt.ylabel('Efficiency')
         plt.grid()
-        plt.legend(loc='lower right', ncol=2, fancybox=True, numpoints=1)
+        plt.legend(loc='lower right', ncol=1, fancybox=True, numpoints=1)
     plt.xlabel('Purity')
 
-    col_labels = ['Sample', 'Working Point', 'Efficiency', 'Purity']
+    col_labels = ['Sample', 'Working Point', 'Config', 'Efficiency', 'Purity']
     if show_fr:
         col_labels.append("Fake Rate")
     return to_html_table(rows, col_labels, row_labels, 'table-condensed')
 
 
 @mpb.decl_fig
-def plot_kinematic_eff(pref, ext='', ylim=(None, None), norm=None, label_pfx='', incl_sel=True,
+def plot_kinematic_eff(pref, proc, config, ext='', ylim=(None, None), norm=None, label_pfx='', incl_sel=True,
                        bins_pt=None, bins_eta=None, bins_phi=None, bins_PU=None,
                        xlim_pt=(None, None), xlim_eta=(None, None), xlim_phi=(None, None),
-                       is_ratio=False, config=None):
+                       is_ratio=False):
     load_samples()
     # Figure out if this one has v_PU
     has_PU =  f'{pref}_v_PU{ext}' in list(samples.values())[0]
@@ -357,10 +358,9 @@ def plot_kinematic_eff(pref, ext='', ylim=(None, None), norm=None, label_pfx='',
     if has_PU:
         ax_PU = plt.subplot(224)
     errors = True
-    for (proc, wp, config_), sample in samples.items():
-        if config is not None and config_ != config:
-            continue
-        sample_name = f'{proc}-{wp}-{config_}'
+    for (proc_, wp, config_), sample in samples.items():
+        if proc != proc_ or config != config_: continue
+        sample_name = f'{proc}-{wp}-{config}'
         l = sample_name
         c = color(proc, wp)
 
@@ -377,9 +377,9 @@ def plot_kinematic_eff(pref, ext='', ylim=(None, None), norm=None, label_pfx='',
                 h = Hist1D(sample[name], no_overflow=True)
                 if norm:
                     h = h / (norm*h.integral)
-                if bins:
+                if bins is not None:
                     h.rebin(bins)
-            hist_plot(h, include_errors=errors, label=l, color=c, linestyle=style(config_))
+            hist_plot(h, include_errors=errors, label=l, color=c, linestyle=style(config))
 
         do_plot(ax_pt, f'{pref}_v_pt{ext}', bins_pt)
         do_plot(ax_eta, f'{pref}_v_eta{ext}', bins_eta)
@@ -416,7 +416,33 @@ def plot_kinematic_eff(pref, ext='', ylim=(None, None), norm=None, label_pfx='',
     else:
         plt.tight_layout()
         plt.legend(loc='upper left', bbox_to_anchor=(0.6, 0.45), bbox_transform=plt.gcf().transFigure,
-                   prop={'size': 20})
+                   prop={'size': 15})
+
+
+@mpb.decl_fig
+def plot_kinematic_eff2d(pref, proc, config, ext=''):
+    from math import sqrt, ceil
+    load_samples()
+
+    n_col = ceil(sqrt(len(wps)))
+    n_row = ceil(len(wps) / n_col)
+
+    plt_idx = 0
+    for (proc_, wp, config_), sample in samples.items():
+        if proc != proc_ or config != config_: continue
+        sample_name = f'{proc}-{wp}-{config}'
+        plt_idx += 1
+
+        plt.subplot(n_row, n_col, plt_idx)
+        h = Hist2D(sample[f'{pref}_v_eta_phi{ext}'], no_overflow=True)
+        h.rebin(div_bins_x=4)
+        plot_2d(h)
+        plt.text(0.1, 0.1, sample_name,
+                 transform=plt.gca().transAxes, size=15, backgroundcolor='#FFFFFFA0')
+        if (plt_idx-1) % n_col == 0:  # left col of plots
+            plt.ylabel(r'$\phi$')
+        if plt_idx > n_col*(n_row-1):  # bottom row of plots
+            plt.xlabel(r'$\eta$')
 
 
 @mpb.decl_fig
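
The subplot layout used by `plot_kinematic_eff2d` in the hunk above (a near-square grid, with axis labels only on the left column and bottom row) can be checked in isolation. The helper names `grid_shape`, `is_left_col` and `is_bottom_row` are hypothetical. They only restate the arithmetic from the diff and assume at least one panel.

```python
from math import ceil, sqrt


def grid_shape(n_panels):
    """Near-square (n_row, n_col) grid with at least n_panels cells."""
    n_col = ceil(sqrt(n_panels))
    n_row = ceil(n_panels / n_col)
    return n_row, n_col


def is_left_col(idx, n_col):
    """True if the 1-based panel index idx sits in the leftmost column."""
    return (idx - 1) % n_col == 0


def is_bottom_row(idx, n_row, n_col):
    """True if the 1-based panel index idx sits in the last grid row."""
    return idx > n_col * (n_row - 1)


for n in (1, 3, 4, 5):
    print(n, grid_shape(n))
```

One consequence of this rule: with 3 panels on a 2x2 grid, panel 2 sits above an empty cell and is not counted as bottom row, so that column gets no x-axis label.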
@@ -460,10 +486,14 @@ def plot_res_contour(proc, hit_number, var, layers, ext='_TrackMatched'):
 
 
 @mpb.decl_fig
-def simple_dist(hist_name, rebin=(), norm=1, xlabel="", ylabel="", xlim=None, ylim=None, line_width=1):
+def simple_dist(hist_name, proc=None, wp=None, config=None,
+                rebin=(), norm=1, xlabel="", ylabel="", xlim=None, ylim=None, line_width=1):
     load_samples()
-    for (proc, wp, config), sample in samples.items():
-        sample_name = f'{proc}-{wp}-{config}'
+    for (proc_, wp_, config_), sample in samples.items():
+        if proc and proc_ != proc: continue
+        if wp and wp_ != wp: continue
+        if config and config_ != config: continue
+        sample_name = f'{proc_}-{wp_}-{config_}'
         h = Hist1D(sample[hist_name])
         if rebin:
             h.rebin(*rebin)
@@ -473,7 +503,7 @@ def simple_dist(hist_name, rebin=(), norm=1, xlabel="", ylabel="", xlim=None, yl
             if norm is not None:
                 h = h * (norm / h.integral)
             hist_plot(h, label=f'{sample_name} ($\\mu={mean:.2f}$)',
-                      color=color(proc, wp), line_width=line_width)
+                      color=color(proc_, wp_), line_width=line_width, linestyle=style(config_))
         except ZeroDivisionError:
             pass
     if xlim:
@@ -482,7 +512,7 @@ def simple_dist(hist_name, rebin=(), norm=1, xlabel="", ylabel="", xlim=None, yl
         plt.ylim(ylim)
     plt.xlabel(xlabel)
     plt.ylabel(ylabel)
-    plt.legend()
+    plt.legend(fontsize=20)
 
 
 @mpb.decl_fig
@@ -503,136 +533,126 @@ def all_cut_plots(build=True, publish=False):
         'tracking_roc_curve': plot_roc_curve('tracking'),
         'tracking_roc_curve_dR': plot_roc_curve('tracking', ext='_dR'),
         'seeding_roc_curve': plot_roc_curve('seed'),
-
-        # 'number_of_seeds': simple_dist('n_seeds', xlabel='Number of Seeds', rebin=(50, -0.5, 200.5)),
-        # 'number_of_good_seeds': simple_dist('n_good_seeds', xlabel='Number of Seeds', rebin=(50, -0.5, 200.5)),
-        # 'number_of_scls': simple_dist('n_scl', xlabel='Number of Super-Clusters', xlim=(-0.5, 25.5)),
-        # 'number_of_good_scls': simple_dist('n_good_scl', xlabel='Number of Super-Clusters', xlim=(-0.5, 25.5)),
-        #
-        # 'number_of_sim_els': simple_dist('n_good_sim', xlabel='Number of prompt(ish) electrons', xlim=(-0.5, 20.5)),
-        # 'number_of_gsf_tracks': simple_dist('n_gsf_track', xlabel='Number of reco electrons', xlim=(-0.5, 20.5)),
-        #
-        # 'number_of_prompt': simple_dist('n_prompt', xlabel='Number of prompt electrons', xlim=(-0.5, 20.5)),
-        # 'number_of_nonprompt': simple_dist('n_nonprompt', xlabel='Number of nonprompt electrons', xlim=(-0.5, 20.5)),
-        #
-        # 'number_of_matched': simple_dist('n_matched', xlabel='Number of matched electrons',
-        #                                  xlim=(-0.5, 10.5), line_width=4),
-        # 'number_of_merged': simple_dist('n_merged', xlabel='Number of merged electrons',
-        #                                 xlim=(-0.5, 10.5), line_width=4),
-        # 'number_of_lost': simple_dist('n_lost', xlabel='Number of lost electrons',
-        #                               xlim=(-0.5, 10.5), line_width=4),
-        # 'number_of_split': simple_dist('n_split', xlabel='Number of split electrons',
-        #                                xlim=(-0.5, 10.5), line_width=4),
-        # 'number_of_faked': simple_dist('n_faked', xlabel='Number of faked electrons',
-        #                                xlim=(-0.5, 10.5), line_width=4),
-        # 'number_of_flipped': simple_dist('n_flipped', xlabel='Number of flipped electrons',
-        #                                  xlim=(-0.5, 10.5), line_width=4),
-        # 'matched_dR': simple_dist('matched_dR', xlabel='dR between sim and reco'),
-        # 'matched_dpT': simple_dist('matched_dpT', xlabel='dpT between sim and reco'),
-        #
-        #
-        # 'number_of_matched_dR': simple_dist('n_matched_dR', xlabel='Number of matched electrons - dR Matched',
-        #                                     xlim=(-0.5, 10.5), line_width=4),
-        # 'number_of_merged_dR': simple_dist('n_merged_dR', xlabel='Number of merged electrons - dR Matched',
-        #                                    xlim=(-0.5, 10.5), line_width=4),
-        # 'number_of_lost_dR': simple_dist('n_lost_dR', xlabel='Number of lost electrons - dR Matched',
-        #                                  xlim=(-0.5, 10.5), line_width=4),
-        # 'number_of_split_dR': simple_dist('n_split_dR', xlabel='Number of split electrons - dR Matched',
-        #                                   xlim=(-0.5, 10.5), line_width=4),
-        # 'number_of_faked_dR': simple_dist('n_faked_dR', xlabel='Number of faked electrons - dR Matched',
-        #                                   xlim=(-0.5, 10.5), line_width=4),
-        # 'number_of_flipped_dR': simple_dist('n_flipped_dR', xlabel='Number of flipped electrons - dR Matched',
-        #                                     xlim=(-0.5, 10.5), line_width=4),
-        # 'matched_dR_dR': simple_dist('matched_dR_dR', xlabel='dR between sim and reco - dR Matched'),
-        # 'matched_dpT_dR': simple_dist('matched_dpT_dR', xlabel='dpT between sim and reco - dR Matched'),
-        #
-        # # 'tm_corr': simple_dist2d('tm_corr', 'dy', 'old-default', xlabel='Seed Matched', ylabel='Track Matched', norm=1),
-        #
-        # 'ecal_rel_res': plot_ecal_rel_res(),
-        # # 'hit_v_layer_BPIX_new-default_dy': plot_hit_vs_layer(('dy', 'new-default-skip'), 'barrel'),
-        # # 'hit_v_layer_FPIX_new-default_dy': plot_hit_vs_layer(('dy', 'new-default-skip'), 'forward'),
-        # # 'hit_v_layer_BPIX_new-default_tt': plot_hit_vs_layer(('tt', 'new-default-skip'), 'barrel'),
-        # # 'hit_v_layer_FPIX_new-default_tt': plot_hit_vs_layer(('tt', 'new-default-skip'), 'forward'),
-        # # 'hit_v_layer_BPIX_new-wide_dy': plot_hit_vs_layer(('dy', 'new-wide-skip'), 'barrel'),
-        # # 'hit_v_layer_FPIX_new-wide_dy': plot_hit_vs_layer(('dy', 'new-wide-skip'), 'forward'),
-        # # 'hit_v_layer_BPIX_new-wide_tt': plot_hit_vs_layer(('tt', 'new-wide-skip'), 'barrel'),
-        # # 'hit_v_layer_FPIX_new-wide_tt': plot_hit_vs_layer(('tt', 'new-wide-skip'), 'forward'),
-        #
-        #
-        # 'seed_kinem': plot_kinematic_eff('seed', norm=1, ylim=(0, None), bins_eta=30, bins_phi=30),
-        # 'scl_kinem': plot_kinematic_eff('scl', norm=1, ylim=(0, None), bins_eta=30, bins_phi=30),
-        # 'prompt_kinem': plot_kinematic_eff('prompt', norm=1, ylim=(0, None), bins_pt=30, bins_eta=30, bins_phi=30),
-        # 'nonprompt_kinem': plot_kinematic_eff('nonprompt', norm=1, ylim=(0, None), xlim_pt=(0, 5),
-        #                                       bins_eta=30, bins_phi=30),
     }
+    for proc, config in product(procs, configs):
+
+        figures[f'number_of_seeds_{proc}_{config}'] = simple_dist('n_seeds', rebin=(50, -0.5, 200.5),
+                                                                  xlabel='Number of Seeds', proc=proc, config=config)
+        figures[f'number_of_good_seeds_{proc}_{config}'] = simple_dist('n_good_seeds', rebin=(50, -0.5, 200.5),
+                                                                       xlabel='Number of Seeds', proc=proc, config=config)
+        figures[f'number_of_scls_{proc}_{config}'] = simple_dist('n_scl', xlabel='Number of Super-Clusters',
+                                                                 xlim=(-0.5, 25.5), proc=proc, config=config)
+        figures[f'number_of_good_scls_{proc}_{config}'] = simple_dist('n_good_scl', xlabel='Number of Super-Clusters',
+                                                                      xlim=(-0.5, 25.5), proc=proc, config=config)
+
+        figures[f'number_of_sim_els_{proc}_{config}'] = simple_dist('n_good_sim', xlabel='Number of prompt(ish) electrons',
+                                                                    xlim=(-0.5, 20.5), proc=proc, config=config)
+        figures[f'number_of_gsf_tracks_{proc}_{config}'] = simple_dist('n_gsf_track', xlabel='Number of reco electrons',
+                                                                       xlim=(-0.5, 20.5), proc=proc, config=config)
+
+        figures[f'number_of_prompt_{proc}_{config}'] = simple_dist('n_prompt', xlabel='Number of prompt electrons',
+                                                                   xlim=(-0.5, 20.5), proc=proc, config=config)
+        figures[f'number_of_nonprompt_{proc}_{config}'] = simple_dist('n_nonprompt', xlabel='Number of nonprompt electrons',
+                                                                      xlim=(-0.5, 20.5), proc=proc, config=config)
+
+        figures[f'number_of_matched_{proc}_{config}'] = simple_dist('n_matched_dR', xlabel='Number of matched electrons',
+                                                                    xlim=(-0.5, 10.5), line_width=4, proc=proc, config=config)
+        figures[f'number_of_merged_{proc}_{config}'] = simple_dist('n_merged_dR', xlabel='Number of merged electrons',
+                                                                   xlim=(-0.5, 10.5), line_width=4, proc=proc, config=config)
+        figures[f'number_of_lost_{proc}_{config}'] = simple_dist('n_lost_dR', xlabel='Number of lost electrons',
+                                                                 xlim=(-0.5, 10.5), line_width=4, proc=proc, config=config)
+        figures[f'number_of_split_{proc}_{config}'] = simple_dist('n_split_dR', xlabel='Number of split electrons',
+                                                                  xlim=(-0.5, 10.5), line_width=4, proc=proc, config=config)
+        figures[f'number_of_faked_{proc}_{config}'] = simple_dist('n_faked_dR', xlabel='Number of faked electrons',
+                                                                  xlim=(-0.5, 10.5), line_width=4, proc=proc, config=config)
+        figures[f'number_of_flipped_{proc}_{config}'] = simple_dist('n_flipped_dR', xlabel='Number of flipped electrons',
+                                                                    xlim=(-0.5, 10.5), line_width=4, proc=proc, config=config)
+        figures[f'matched_dR_{proc}_{config}'] = simple_dist('matched_dR', xlabel='dR between sim and reco',
+                                                             proc=proc, config=config)
+        figures[f'matched_dpT_{proc}_{config}'] = simple_dist('matched_dpT', xlabel='dpT between sim and reco',
+                                                              proc=proc, config=config)
+
+        for wp in wps:
+            figures[f'hit_v_layer_BPIX_{proc}_{wp}_{config}'] = plot_hit_vs_layer((proc, wp, config), 'barrel')
+            figures[f'hit_v_layer_FPIX_{proc}_{wp}_{config}'] = plot_hit_vs_layer((proc, wp, config), 'forward')
+
+    for proc, config in product(procs, configs):
+        bins_pt = np.linspace(0, 100, 30)
+        figures[f'good_sim_kinem_{proc}_{config}'] = plot_kinematic_eff('good_sim', proc, config, norm=1, ylim=(0, None),
+                                                                        bins_eta=30, bins_phi=30, bins_pt=bins_pt)
+        figures[f'good_sim_kinem2d_{proc}_{config}'] = plot_kinematic_eff2d('good_sim', proc, config)
+
+        figures[f'gsf_track_kinem_{proc}_{config}'] = plot_kinematic_eff('gsf_track', proc, config, norm=1, ylim=(0, None),
+                                                                         bins_eta=30, bins_phi=30, bins_pt=bins_pt)
+        figures[f'gsf_track_kinem2d_{proc}_{config}'] = plot_kinematic_eff2d('gsf_track', proc, config)
+
+        figures[f'gsf_high_pt1_kinem_{proc}_{config}'] = plot_kinematic_eff('gsf_high_pt1', proc, config,
+                                                                            norm=1, ylim=(0, None), bins_eta=30,
+                                                                            bins_phi=30, bins_pt=bins_pt)
+        figures[f'gsf_high_pt1_kinem2d_{proc}_{config}'] = plot_kinematic_eff2d('gsf_high_pt1', proc, config)
+
+        figures[f'gsf_high_pt2_kinem_{proc}_{config}'] = plot_kinematic_eff('gsf_high_pt2', proc, config,
+                                                                            norm=1, ylim=(0, None), bins_eta=30,
+                                                                            bins_phi=30, bins_pt=bins_pt)
+        figures[f'gsf_high_pt2_kinem2d_{proc}_{config}'] = plot_kinematic_eff2d('gsf_high_pt2', proc, config)
 
-    for config in configs:
-        figures[f'good_sim_kinem_{config}'] = plot_kinematic_eff('good_sim', norm=1, ylim=(0, None),
-                                                                 bins_eta=30, bins_phi=30, config=config)
-        figures[f'gsf_track_kinem_{config}'] = plot_kinematic_eff('gsf_track', norm=1, ylim=(0, None),
-                                                                  bins_eta=30, bins_phi=30, config=config)
 
     # for proc, wp, region in product(procs, wps, ('BPIX', 'FPIX')):
     #     figures[f'hit_v_layer_{region}_{wp}_{proc}'] = plot_hit_vs_layer((proc, wp), region)
 
-    def add_num_den(key, func, args, kwargs):
-        base_ext = kwargs.get('ext', '')
-        bins_pt_ = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 300]
-        kwargs['bins_pt'] = kwargs.get('bins_pt', bins_pt_)
-        kwargs['bins_eta'] = kwargs.get('bins_eta', 15)
-        kwargs['bins_phi'] = kwargs.get('bins_phi', 15)
-        figures[key] = func(*args, ylim=(0, 1.1), is_ratio=True, **kwargs)
-        kwargs_ = kwargs.copy()
-        kwargs_['ext'] = base_ext+'_num'
-        figures[key+'_num'] = func(*args, **kwargs_)
-        kwargs_ = kwargs.copy()
-        kwargs_['ext'] = base_ext+'_den'
-        figures[key+'_den'] = func(*args, **kwargs_)
-
-    # add_num_den('tracking_eff', plot_kinematic_eff, ('tracking_eff',), dict(incl_sel=False))
-    # add_num_den('tracking_pur', plot_kinematic_eff, ('tracking_pur',), dict(incl_sel=False))
-    add_num_den('tracking_eff_dR', plot_kinematic_eff, ('tracking_eff',), dict(ext='_dR', incl_sel=False))
-    add_num_den('tracking_pur_dR', plot_kinematic_eff, ('tracking_pur',), dict(ext='_dR', incl_sel=False))
-    # add_num_den('prompt_eff', plot_kinematic_eff, ('prompt_eff',), dict(incl_sel=False))
-    # add_num_den('prompt_pur', plot_kinematic_eff, ('prompt_pur',), dict(incl_sel=False))
-    add_num_den('prompt_eff_dR', plot_kinematic_eff, ('prompt_eff',), dict(ext='_dR', incl_sel=False))
-    add_num_den('prompt_pur_dR', plot_kinematic_eff, ('prompt_pur',), dict(ext='_dR', incl_sel=False))
-    # add_num_den('nonprompt_eff', plot_kinematic_eff, ('nonprompt_eff',), dict(incl_sel=False))
-    # add_num_den('nonprompt_pur', plot_kinematic_eff, ('nonprompt_pur',), dict(incl_sel=False))
-    add_num_den('nonprompt_eff_dR', plot_kinematic_eff, ('nonprompt_eff',), dict(ext='_dR', incl_sel=False))
-    add_num_den('nonprompt_pur_dR', plot_kinematic_eff, ('nonprompt_pur',), dict(ext='_dR', incl_sel=False))
-
-    add_num_den('seeding_eff', plot_kinematic_eff, ('seed_eff',), dict(incl_sel=False))
-    add_num_den('seeding_pur', plot_kinematic_eff, ('seed_pur',), dict(incl_sel=False))
-    #
-    # add_num_den('fake_rate_incl', plot_kinematic_eff, ('fake_rate_incl',), {})
-    # add_num_den('fake_rate_no_e_match_incl', plot_kinematic_eff, ('fake_rate_no_e_match_incl',), {})
-    # add_num_den('partial_fake_rate_incl', plot_kinematic_eff, ('partial_fake_rate_incl',), {})
-    # add_num_den('full_fake_rate_incl', plot_kinematic_eff, ('full_fake_rate_incl',), {})
-    # add_num_den('clean_fake_rate_incl', plot_kinematic_eff, ('clean_fake_rate_incl',), {})
-    #
-    # add_num_den('fake_rate', plot_kinematic_eff, ('fake_rate',), dict(incl_sel=False))
-    # add_num_den('fake_rate_no_e_match', plot_kinematic_eff, ('fake_rate_no_e_match',), dict(incl_sel=False))
-    # add_num_den('partial_fake_rate', plot_kinematic_eff, ('partial_fake_rate',), dict(incl_sel=False))
-    # add_num_den('full_fake_rate', plot_kinematic_eff, ('full_fake_rate',), dict(incl_sel=False))
-    # add_num_den('clean_fake_rate', plot_kinematic_eff, ('clean_fake_rate',), dict(incl_sel=False))
-    #
-    # # hit_layers = [(1, 1), (1, 2), (2, 2), (2, 3), (3, 3), (3, 4)]
-    # # for proc, wp, (hit, layer), var, subdet in product(['dy', 'tt'], ['new-default-noskip', 'new-wide-noskip'],
-    # #                                                    hit_layers, ['dPhi', 'dRz'], ['BPIX', 'FPIX']):
-    # #     figures[f'res_{subdet}_L{layer}_H{hit}_{var}_{proc}_{wp}'] = plot_residuals((proc, wp), layer, hit, var, subdet)
-    #
-    # # rel_layers = {1: [('BPIX', 1), ('BPIX', 2), ('FPIX', 1), ('FPIX', 2)],
-    # #               2: [('BPIX', 2), ('BPIX', 3), ('FPIX', 2), ('FPIX', 3)],
-    # #               3: [('BPIX', 3), ('BPIX', 4), ('FPIX', 3)], }
-    # # for proc, hit, var in product(['dy', 'tt'], [1, 2, 3], ['dPhi', 'dRz']):
-    # #     figures[f'resall_H{hit}_{var}_{proc}'] = plot_res_contour(proc, hit, var, rel_layers[hit])
-    #
-    # # for proc, wp, hit, var in product(['dy', 'tt'], ['new-default-noskip', 'new-wide-noskip'], [1, 2, 3], ['dPhi', 'dRz']):
-    # #     figures[f'res_v_eta_H{hit}_{var}_{proc}_{wp}'] = plot_residuals_eta((proc, wp), hit, var)
-    #
-    # # figures = {}
-    #
+    for proc, config in product(procs, configs):
+        def add_num_den(key, func, args, kwargs):
+            key = f'{key}_{proc}_{config}'
+            base_ext = kwargs.get('ext', '')
+            bins_pt_ = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 300]
+            kwargs['bins_pt'] = kwargs.get('bins_pt', bins_pt_)
+            kwargs['bins_eta'] = kwargs.get('bins_eta', 15)
+            kwargs['bins_phi'] = kwargs.get('bins_phi', 15)
+            args = *args, proc, config
+            figures[key] = func(*args, ylim=(0, 1.1), is_ratio=True, **kwargs)
+            kwargs_ = kwargs.copy()
+            kwargs_['ext'] = base_ext+'_num'
+            figures[key+'_num'] = func(*args, ylim=(0, None), **kwargs_)
+            kwargs_ = kwargs.copy()
+            kwargs_['ext'] = base_ext+'_den'
+            figures[key+'_den'] = func(*args, ylim=(0, None), **kwargs_)
+
+        add_num_den('tracking_eff_dR', plot_kinematic_eff, ('tracking_eff',), dict(ext='_dR', incl_sel=False))
+        add_num_den('tracking_pur_dR', plot_kinematic_eff, ('tracking_pur',), dict(ext='_dR', incl_sel=False))
+        # add_num_den('prompt_eff_dR', plot_kinematic_eff, ('prompt_eff',), dict(ext='_dR', incl_sel=False))
+        # add_num_den('prompt_pur_dR', plot_kinematic_eff, ('prompt_pur',), dict(ext='_dR', incl_sel=False))
+        # add_num_den('nonprompt_eff_dR', plot_kinematic_eff, ('nonprompt_eff',), dict(ext='_dR', incl_sel=False))
+        # add_num_den('nonprompt_pur_dR', plot_kinematic_eff, ('nonprompt_pur',), dict(ext='_dR', incl_sel=False))
+
+        add_num_den('seeding_eff', plot_kinematic_eff, ('seed_eff',), dict(incl_sel=False))
+        add_num_den('seeding_pur', plot_kinematic_eff, ('seed_pur',), dict(incl_sel=False))
+
+        add_num_den('fake_rate_incl', plot_kinematic_eff, ('fake_rate_incl',), {})
+        add_num_den('fake_rate_no_e_match_incl', plot_kinematic_eff, ('fake_rate_no_e_match_incl',), {})
+        add_num_den('partial_fake_rate_incl', plot_kinematic_eff, ('partial_fake_rate_incl',), {})
+        add_num_den('full_fake_rate_incl', plot_kinematic_eff, ('full_fake_rate_incl',), {})
+        add_num_den('clean_fake_rate_incl', plot_kinematic_eff, ('clean_fake_rate_incl',), {})
+
+        add_num_den('fake_rate', plot_kinematic_eff, ('fake_rate',), dict(incl_sel=False))
+        add_num_den('fake_rate_no_e_match', plot_kinematic_eff, ('fake_rate_no_e_match',), dict(incl_sel=False))
+        add_num_den('partial_fake_rate', plot_kinematic_eff, ('partial_fake_rate',), dict(incl_sel=False))
+        add_num_den('full_fake_rate', plot_kinematic_eff, ('full_fake_rate',), dict(incl_sel=False))
+        add_num_den('clean_fake_rate', plot_kinematic_eff, ('clean_fake_rate',), dict(incl_sel=False))
+
+    # hit_layers = [(1, 1), (1, 2), (2, 2), (2, 3), (3, 3), (3, 4)]
+    # for proc, wp, (hit, layer), var, subdet in product(['dy', 'tt'], ['new-default-noskip', 'new-wide-noskip'],
+    #                                                    hit_layers, ['dPhi', 'dRz'], ['BPIX', 'FPIX']):
+    #     figures[f'res_{subdet}_L{layer}_H{hit}_{var}_{proc}_{wp}'] = plot_residuals((proc, wp), layer, hit, var, subdet)
+
+    # rel_layers = {1: [('BPIX', 1), ('BPIX', 2), ('FPIX', 1), ('FPIX', 2)],
+    #               2: [('BPIX', 2), ('BPIX', 3), ('FPIX', 2), ('FPIX', 3)],
+    #               3: [('BPIX', 3), ('BPIX', 4), ('FPIX', 3)], }
+    # for proc, hit, var in product(['dy', 'tt'], [1, 2, 3], ['dPhi', 'dRz']):
+    #     figures[f'resall_H{hit}_{var}_{proc}'] = plot_res_contour(proc, hit, var, rel_layers[hit])
+
+    # for proc, wp, hit, var in product(['dy', 'tt'], ['new-default-noskip', 'new-wide-noskip'], [1, 2, 3], ['dPhi', 'dRz']):
+    #     figures[f'res_v_eta_H{hit}_{var}_{proc}_{wp}'] = plot_residuals_eta((proc, wp), hit, var)
+
     def add_simple_plot(proc, wp, config, plot_name, xlabel, ylabel, is2d=False, xlim=None, ylim=None, clear_zero=False):
         if is2d:
             figures[f'{plot_name}_{proc}_{wp}_{config}'] = \
@@ -716,9 +736,9 @@ if __name__ == '__main__':
     set_defaults()
     mpb.configure(output_dir='seeding_studies',
                   multiprocess=True,
-                  publish_remote="caleb@fangmeier.tech",
-                  publish_dir="/var/www/eg",
-                  publish_url="eg.fangmeier.tech",
+                  publish_remote="cfangmeier@t3.unl.edu",
+                  publish_dir="/home/dominguez/cfangmeier/public_html/eg/",
+                  publish_url="t3.unl.edu/~cfangmeier/eg/",
                   early_abort=True,
                   )
     procs = {
@@ -735,6 +755,6 @@ if __name__ == '__main__':
         # 'new-narrow': 'HLT Settings',
         'new-default': 'HLT Settings',
         'new-wide': 'Wide Settings',
-        # 'new-extra-wide': 'Extra Wide Settings',
+        'new-extra-wide': 'Extra Wide Settings',
     }
     all_cut_plots(build=args.build, publish=args.publish)

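The `add_num_den` helper above registers three figures per call: the ratio plot plus its numerator and denominator, keyed by process and configuration. A minimal sketch of that pattern follows. Here `fake_plot` and `register_num_den` are hypothetical stand-ins (not functions from this repository), and the process and configuration are passed explicitly rather than captured from the enclosing loop.

```python
from itertools import product


def fake_plot(name, proc, config, ext='', **kwargs):
    """Hypothetical stand-in for a plotting function; returns a label, not a figure."""
    return f'{name}{ext}:{proc}:{config}'


def register_num_den(figures, key, func, args, proc, config, **kwargs):
    """Store the ratio plot and its numerator/denominator under related keys."""
    base_ext = kwargs.pop('ext', '')
    full_key = f'{key}_{proc}_{config}'
    # Ratio plot keeps the caller's extension unchanged
    figures[full_key] = func(*args, proc, config, ext=base_ext, **kwargs)
    # Numerator and denominator get '_num' / '_den' appended to key and extension
    for part in ('num', 'den'):
        figures[f'{full_key}_{part}'] = func(
            *args, proc, config, ext=f'{base_ext}_{part}', **kwargs)


figures = {}
for proc, config in product(['dy', 'tt'], ['new-default', 'new-wide']):
    register_num_den(figures, 'seeding_eff', fake_plot, ('seed_eff',), proc, config)

print(len(figures))
```

Passing `proc` and `config` as arguments means every call states which sample it belongs to, regardless of where the call sits relative to the loop.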
+ 89 - 0
plotting/examine_seeds.py

@@ -0,0 +1,89 @@
+
+from collections import defaultdict
+from uproot import open as root_open
+
+
+def main():
+    f_old = root_open('trackingNtuple_old_default.root')['trackingNtuple/tree']
+    f_new = root_open('trackingNtuple_new_default.root')['trackingNtuple/tree']
+
+    keys = [b'see_sclIdx', b'see_trkIdx',
+            b'scl_e', b'scl_px', b'scl_py', b'scl_pz', b'scl_hoe',
+            b'trk_q']
+    arrs_old = f_old.arrays(keys)
+    arrs_new = f_new.arrays(keys)
+
+    def dump_event(event, name):
+        print('-'*20 + f'{name:10}' + '-'*20)
+        # print(event[b'scl_hoe'] <= 0.15)
+
+        def get_cols(*strs):
+            en = enumerate(zip(*[event[s] for s in strs]))
+            return en
+
+        print('Seed Info')
+        for idx, (sclIdx, trkIdx) in get_cols('see_sclIdx', 'see_trkIdx'):
+            if sclIdx < 0: continue
+            if event['scl_hoe'][sclIdx] > 0.15: continue
+
+            trk_q = '-'
+            if trkIdx >= 0:
+                trk_q = str(event["trk_q"][trkIdx])
+
+            print(f'{idx:3d}) {sclIdx:10d} {trk_q:10s}')
+        # print(event[b'see_sclIdx'])
+
+    def dump_scl(event):
+        def get_cols(*strs):
+            en = enumerate(zip(*[event[s] for s in strs]))
+            return en
+
+        print('Supercluster Info')
+        for idx, (e, px, py, pz, hoe) in get_cols('scl_e', 'scl_px', 'scl_py', 'scl_pz', 'scl_hoe'):
+            print(f'{idx:3d}) {hoe:10.2f} {e:10.2f}')
+
+
+    def seed_summary(event_old, event_new):
+        def get_cols(event, *strs):
+            en = enumerate(zip(*[event[s] for s in strs]))
+            return en
+
+        counts_old = defaultdict(int)
+        counts_new = defaultdict(int)
+        # print('Supercluster Info')
+        # for idx, (e, px, py, pz, hoe) in get_cols('scl_e', 'scl_px', 'scl_py', 'scl_pz', 'scl_hoe'):
+        #     print(f'{idx:3d}) {hoe:10.2f} {e:10.2f}')
+        print('Seed Info')
+        for _, (sclIdx,) in get_cols(event_old, 'see_sclIdx'):
+            if sclIdx >= 0:
+                # if event_old['scl_hoe'][sclIdx] > 0.15: continue
+                counts_old[sclIdx] += 1
+        for _, (sclIdx,) in get_cols(event_new, 'see_sclIdx'):
+            if sclIdx >= 0:
+                # if event_new['scl_hoe'][sclIdx] > 0.15: continue
+                counts_new[sclIdx] += 1
+
+        for idx, (e, px, py, pz, hoe) in get_cols(event_old, 'scl_e', 'scl_px', 'scl_py', 'scl_pz', 'scl_hoe'):
+            if hoe > 0.15: continue
+            print(f'{idx:3d}) {hoe:10.2f} {e:10.2f} {counts_old[idx]:10d} {counts_new[idx]:10d}')
+
+
+
+    # Dump only the first few events, and never more than either file contains
+    nevt = min(5, len(arrs_old[keys[0]]), len(arrs_new[keys[0]]))
+    for eIdx in range(nevt):
+        print(f'NEW EVENT: {eIdx}')
+        old = {key.decode(): arrs_old[key][eIdx] for key in keys}
+        new = {key.decode(): arrs_new[key][eIdx] for key in keys}
+        # dump_scl(old)
+        # dump_event(old, 'OLD')
+        # dump_event(new, 'NEW')
+        seed_summary(old, new)
+        # print(new[b'see_sclIdx'])
+
+
+
+
+
+if __name__ == '__main__':
+    main()
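
The per-supercluster comparison in `seed_summary` can be tried without any ROOT files. The sketch below reproduces only the counting logic on plain Python lists. The function names `count_seeds` and `seed_table` and the toy event values are invented for illustration; the 0.15 H/E threshold is the one used in the script.

```python
from collections import Counter


def count_seeds(see_scl_idx):
    """Count seeds per supercluster index, skipping unmatched seeds (index < 0)."""
    return Counter(idx for idx in see_scl_idx if idx >= 0)


def seed_table(scl_hoe, seeds_old, seeds_new, max_hoe=0.15):
    """Return (supercluster index, n_old, n_new) for superclusters passing the H/E cut."""
    counts_old = count_seeds(seeds_old)
    counts_new = count_seeds(seeds_new)
    return [(idx, counts_old[idx], counts_new[idx])
            for idx, hoe in enumerate(scl_hoe) if hoe <= max_hoe]


# Toy event with invented values, not taken from any ntuple
scl_hoe = [0.02, 0.30, 0.10]
seeds_old = [0, 0, 2, -1]
seeds_new = [0, 2, 2, 2, 1]

for row in seed_table(scl_hoe, seeds_old, seeds_new):
    print(row)
```

A `Counter` returns zero for missing keys, so superclusters with no matched seeds still appear in the table with a count of 0.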