## Abstract

Hydrogen storage in porous geological formations is a potential option to mitigate offsets between power demand and generation in an energy system largely based on renewables. Incorporating hydrogen storage into the energy network requires the consideration of multiple scenarios for storage settings and potential loading cycles, causing a high computational effort. Therefore, homogenous replacement models are constructed by applying different spatial averaging methods for permeability and linearized relative permeability to an ensemble of heterogeneous reservoir representations of a potential hydrogen storage site. The applicability of these replacement models for approximating storage characteristics, such as well flow rates, pressure changes and power rates, is investigated by comparing their results to the results of the full heterogeneous ensemble. It is found that using the arithmetic mean to estimate the lateral and the harmonic mean for the vertical permeability in the homogeneous replacement models provides an approximation to the median of the heterogeneous ensemble for pressure changes, storage flow rate, gas in place and power output. Basic time-dependent effects of reducing well flow, and thus the power rates, during an extraction cycle can also be represented by these homogeneous replacement models. Using geometric means is found not to yield a valid representation of the storage behaviour.

Reducing greenhouse gas emissions, of which in 2010 around 34% was contributed by the energy sector, is a key component in mitigating climate change (IPCC 2014). One pathway to reduce these emissions in the energy sector is the switch from fossil fuel based to renewable power generation; with the target for the German ‘Energiewende’, for instance, being 80% renewable power generation by the year 2050 (BMWi 2016). However, power generation from renewable sources, such as wind or solar power, fluctuates strongly with time, resulting in power generation shortages during unfavourable weather conditions (e.g. Klaus *et al.* 2010; Heide *et al.* 2011).

By the mid-1970s the concept of storing electrical energy in the form of hydrogen, which can be either re-electrified or used directly for, for example, transportation, was introduced (Sørensen 1975). With increasing renewable power generation, the idea of using a chemical energy carrier such as hydrogen in a power-to-gas concept to mitigate the temporal offsets between power supply and demand regained traction (e.g. Sørensen *et al.* 2004; Gahleitner 2013; Varone & Ferrari 2015; Thema *et al.* 2016).

Experience with the bulk storage of natural gas has shown that underground gas storage in porous geological formations such as aquifers or depleted gas fields can be used to supply large storage capacities over long withdrawal periods (e.g. Gregory & Pangborn 1976; Ogden 1999; Evans & West 2008). Experience of storing H_{2} in geological formations, however, is very scarce. Practical experience in H_{2} storage in porous geological formations is limited to the storage of H_{2}-rich town gas (50% H_{2}) in aquifers in France (Beynes), Czechoslovakia (Lobodice) and Germany (Engelborstel, Bad Lauchstädt) (Carden & Paterson 1979; Foh *et al.* 1979; Panfilov 2010; Kruck *et al.* 2013). Several technical issues arise from the use of hydrogen instead of natural gas such as increased corrosion or degradation of technical installations, which are discussed in Reitenbach *et al.* (2015), as well as changes due to H_{2} dissolution in porewater (Li *et al.* 2018). In a number of technical and hydraulic aspects, underground storage of natural gas can be used as an analogue for storing H_{2} (Carden & Paterson 1979; Foh *et al.* 1979).

The geological subsurface is already used for a wide range of applications or technologies, ranging from groundwater abstraction or hydrocarbon exploration to small- and large-scale energy storage. As the suitable formations are in general limited in extent and availability, *a priori* assessments of induced effects and impacts on other uses should be carried out in addition to investigating potential storage dimensions (Bauer *et al.* 2013; Kabuth *et al.* 2017). This can be done through scenario simulations using realistically parameterized storage sites (Bauer *et al.* 2012), and has been shown exemplarily for CO_{2} storage (Benisch & Bauer 2013) and compressed air energy storage (Wang & Bauer 2017), as well as for H_{2} storage (Pfeiffer *et al.* 2017).

The magnitude of hydraulic effects induced by a storage operation depends on the operation schedule (i.e. the target withdrawal or injection rates and overall capacity), which itself depends in turn on the requirements of the energy system into which the storage site is embedded. Such future energy systems are not clearly defined now, which introduces uncertainty into the operational schedule and thus the required storage size. However, scenario simulations that consider the energy system, power plants and geological storage can identify potential usage cases for a specific storage option and location. This requires the evaluation of a large number of (storage) simulations, which is accompanied by a high computational effort. Furthermore, some uncertainty in the distribution of hydraulic parameters always has to be accounted for at any given storage site (e.g. Durlofsky 2005), typically requiring multiple simulation runs to generate an ensemble of realizations. This costly approach was employed in Pfeiffer *et al.* (2017) for a porous formation H_{2} storage case.

Thus, when considering the integration of a H_{2} storage site into the energy network, short runtimes of the storage simulations are required in order to assess different storage schedules and or different storage sizes. This can be achieved through reducing the model complexity of the storage simulations: for example, by aggregating regions of similar properties in a reservoir and assigning effective parameters for those, as is typically done during upscaling. With regard to this, a variety of methods have been discussed in the literature (e.g. King *et al.* 1993; Ringrose *et al.* 1993; Kumar & Jerauld 1996; Pickup & Hern 2002). A review of single-phase upscaling techniques is given in Renard & de Marsily (1997). For multiphase systems, typically more complex upscaling techniques are used that include capillary pressure and relative phase permeability effects (e.g. see Barker & Thibeau 1997; Christie 2001; Artus & Noettinger 2004; Pickup *et al.* 2005). However, no method exists that provides a truly exact upscaled model for a given multiphase flow problem (Ataie-Ashiani *et al.* 2001). The common idea of all upscaling methods is to retain the flow features of a heterogeneous porous formation, as well as to reduce the computational load (Christie 2001). The resulting upscaled models still resemble an, albeit less complex, heterogeneous simulation case and thus do not necessarily address the variability of the distribution of the reservoir heterogeneity, while at the same time providing a reduced spatial resolution for the investigation of induced hydraulic effects.

The aim of this work, therefore, is to derive homogenous replacement models for a given heterogeneous ensemble and to test their applicability when assessing dynamic storage performance. Such homogeneous replacement models would greatly reduce the number of simulation runs required to investigate varying storage set-ups for a given site. The heterogeneous ensemble used to generate the homogenous replacement models is presented in Pfeiffer *et al.* (2017) and represents a porous medium hydrogen storage case study. Results from the full heterogeneous ensemble are compared to results of different homogeneous replacement models using indicators such as gas in place, flow rates or power output for storage performance, as well as pressure change and gas-phase distribution for hydraulic effects.

Analytical averaging methods are used to obtain the effective hydraulic permeability and porosity, with the capillary pressure and relative permeability calculated based on a linearized capillary equilibrium upscaling. Averaging is applied to the full heterogeneous ensemble in a global scheme, so that for each geological unit and each tested averaging scheme a homogeneous property is obtained.

## Storage scenario and simulation set-up

The storage demand scenario used in this study is largely based on the scenario presented in Pfeiffer *et al.* (2017), which assumes week-long shortage periods in power production from renewable sources. Taking the total electricity consumption in the state of Schleswig-Holstein, Germany (with a population of *c*. 2.8 million people), of 47.79 × 10^{6} GJ in 2015 as reference (MELUR 2018), a total deficit of around 0.92 × 10^{6} GJ or 254 594 MWh must be provided during a week-long shortage period, with the average load being 1515 MW. This corresponds to an H_{2} volume of around 144 × 10^{6} Sm^{3} (standard cubic metre: defined here as 1 bar and 15°C), assuming the energy density of H_{2} to be about 124 MJ kg^{−1} (Carden & Paterson 1979) and the efficiency of the re-electrification process to be 0.6 for H_{2} (Büchi *et al.* 2014).

For the simulation of the hypothetical storage operation an existing structure located in the North German Basin (NGB) was used (Fig. 1), which was previously investigated for CO_{2} sequestration (Hese 2011, 2012) and is also described in Pfeiffer *et al.* (2016, 2017). At the storage site Rhaetian deposits, the Middle Jurassic and the Lower Cretaceous make up the storage complex, with the latter two being regarded barrier formations (Hese 2012). The depositional system of the Rhaetian varies spatially from a non-marine system in east of the NGB to a marine setting in the west (Doornenbal & Stevenson 2010). The location of the storage site within the NGB indicates a paralic system, with the deposits consisting of several distinct sandstone layers with intermediate shale–mudstone layers (Gaupp 1991; Doornenbal & Stevenson 2010; Hese 2011, 2012). In the NGB the Rhaetian deposits can be divided into the Lower, Upper and Middle Rhaetian (Battermann 1989; Gaupp 1991; Deutsche Stratigraphische Kommission (DSK) 2005). The Upper and Lower Rhaetian both show coarsening-upwards trends from mudstones near the respective bases to sandstones in the upper sections (Gaupp 1991; Deutsche Stratigraphische Kommission (DSK) 2005). The Middle Rhaetian can be further divided into the Lower Shale, the Main Sandstone, the Middle Shale, the Flaser Sandstones and the Upper Shale (Gaupp 1991; Deutsche Stratigraphische Kommission (DSK) 2005). Of the Rhaetian deposits, the Main Sandstone of the Middle Rhaetian is the target storage formation of this study, providing in general thicknesses of several metres and a relatively homogenous facies distribution (Fahrion & Betz 1991; Gaupp 1991; Deutsche Stratigraphische Kommission (DSK) 2005; Hese 2011, 2012). The storage structure includes several NW–SE-striking faults that were assumed tight and thus represent a closed hydraulic boundary. In Pfeiffer *et al.* (2017), the Upper, Middle and Lower Rhaetian were included in the heterogeneous ensemble model based on facies descriptions given by Gaupp (1991) and Hese (2011, 2012), with the Middle Rhaetian being differentiated in the Lower Shales, the Main Sandstone and the Upper Shales. In the absence of site-specific correlation lengths, 2500 and 500 m were assumed for the major and minor directions, respectively. The vertical structure of the individual units was included by generating vertical correlation trends corresponding to the coarsening-upwards described in the literature. With this, 25 equally likely realizations were created and used to investigate the variability of induced hydraulic effects of a H_{2} storage operation, as well as the performance of the storage site.

The shallower eastern flank of the structure was selected to be used for the storage operation. The storage operation was simulated using five vertical wells, labelled 1–5 from north to south (Fig. 1). Reference depths of the wells are 493, 481, 479, 463 and 474 m for wells 1–5. The maximum allowable bottom hole pressure (BHP) was set to 65 bars, corresponding to a maximum overpressure at the shallowest well of about 17.5 bars, which results in an overpressure gradient of 14 MPa km^{−1}. This is around 80% of the lower estimate given by Röckel & Lempp (2003) for the minimal horizontal stress for suprasalinar sediments of the NGB and below the average of 16.8 MPa km^{−1} for gas storage sites in Germany (Sedlacek 1999). The lower BHP limit was set to 30 bars to provide sufficient pressure for technical installations.

The simulated storage operation consists of an initial filling with N_{2} and subsequently H_{2} for a total of 920 days, so that the target N_{2} and H_{2} volumes in place are 39.5 × 10^{6} and 32.6 × 10^{6} Sm^{3}, respectively (Pfeiffer *et al.* 2017). Following the initial filling, the actual storage operation starts with six week-long withdrawal periods with a target power output of 370 MW, each followed by an injection of H_{2} over 50 days and a short shut-in of 30 days. After the six week-long storage periods, three storage cycles with 14 day withdrawal periods at the same target rate (370 MW) were simulated. These longer storage cycles include a refilling phase of 100 days and a shut-in period of 30 days. Thus, under ideal conditions the storage site should be able to supply 24% of the previously defined storage demand.

A simulation model of the eastern half of the structure was constructed with a lateral discretization of 50 × 50 m. Vertical discretization varies between 0.2 and 5 m. The barrier formations above and below the Rhaetian were assumed tight and thus excluded from the simulation. All simulations were carried using the Eclipse E300 simulator, which assesses immiscible two-phase flow. Although hydrodynamic dispersion is a factor that contributes to the mixing processes in multiphase flow systems (e.g. Feldmann *et al.* 2016), gas in gas dispersion cannot be represented in the model. Fluid properties were calculated using a generalized Peng–Robinson equation of state (Schlumberger 2015). With the equation of state (EOS) parameters given in Pfeiffer *et al.* (2017), densities and viscosities are within the uncertainty interval of the reference values given in Lemmon *et al.* (2016).

An initial hydrostatic pressure distribution was applied and closed hydraulic boundary conditions were set along most parts of the western model boundary to account for the fault system. Apart from the regions where faults dictate the boundary conditions, the effect of the extent of Rhaetian beyond the model domain was included by increasing the pore volume and decreasing the permeability of the boundary elements. The factors were determined based on the distribution of the Rhaetian given in Baldschuhn *et al.* (2001) and assuming a constant thickness. The additional extent of the storage formation accounted for was 48, 20, 14 and 20 km towards the north, east, south and west, respectively. More details on the storage scenario and the geological simulation set-up are given in Pfeiffer *et al.* (2017).

## Homogenous replacement models

The aim of this work is to derive, construct and test homogenous replacement models given a full heterogeneous ensemble. For this, effective homogeneous parameters must be found for rock properties such as intrinsic permeability and porosity, as well as for capillary pressure and relative permeability functions.

Renard & de Marsily (1997) gave a comprehensive overview of different methods of varying complexity to calculate equivalent permeabilities for single-phase flow systems. In systems with steady-state fluid flow parallel or perpendicular to the boundaries of internally homogenous rock layers, a correct estimation of the effective permeability can be obtained in the form of the weighted arithmetic (equation 1) or harmonic mean (equation 2) of permeability for the former and latter case, respectively (e.g. Bear 1972):*n* denoting the sample size, *k _{i}* the permeability and

*t*the thickness of layer

_{i}*i*. Even though the underlying assumptions of fluid flow parallel or perpendicular to the orientation of rock layers of constant thickness often do not hold, these formulas provide estimates of the upper and lower bounds for the true effective permeability (e.g. Renard & de Marsily 1997). For more complex rock geometries and fluid-flow patterns, the geometric mean (equation 3) is proposed as a good approximation in random permeability fields with a log-normal distribution and a low variance (e.g. Ringrose & Bentley 2015):

*x*-direction:

*et al.*(1978) showed that the effective permeability is given by:

Averaging was applied to each of the five geological units separately. The individual sample size is thus the grid cell number of each specific unit. When combining different averaging schemes (i.e. for replacement models #2, 4 and 5) (see Table 1), first the average along each vertical cell column within a unit was calculated with either an arithmetic mean (model #5) or a harmonic mean (models #2 and #4). Subsequently, the mean value of each column was averaged in the lateral direction to obtain the final effective homogeneous permeability. The reason for this approach is to reduce the effect of small-scale low-permeability zones during harmonic or geometric averaging (e.g. model #5) and to account for vertical anisotropy (models #2 and #4). Besides this, anisotropy in permeability was also considered for the homogeneous replacement models 1, 3, 5 and 7 by estimating the vertical permeability using the harmonic mean (not listed in Table 1).

Applying these averaging methods yields homogenous hydraulic permeabilities varying over orders of magnitude for a given horizon. This variation is smaller for comparatively homogeneous units, such as the Main Sandstone of the Middle Rhaetian, and higher for the highly heterogeneous Lower Rhaetian (cf. Pfeiffer *et al.* 2017).

As already noted before, porosity and the multiphase flow properties must also be averaged. Independent of the scheme to estimate hydraulic permeability, the mean porosity was calculated using the arithmetic mean of all porosity values for the individual formation of the heterogeneous ensemble. Several methods to obtain the effective capillary pressure and relative permeability functions are discussed in the literature, which can be generally differentiated into static (or steady-state) and dynamic methods (e.g. Barker & Thibeau 1997; Christie 2001). For dynamic upscaling, multiphase flow simulations are performed using small-scale models and scenario-specific boundary conditions, which are then used to generate pseudo-relative permeabilities for use in coarse-grid simulation models (Christie 2001). Static or steady-state methods assume that time derivatives of fluid flow can be neglected, resulting in constant fluid saturations. Static methods can be further subdivided into capillary equilibrium, gravity capillary equilibrium and viscous-limited upscaling methods, with the latter neglecting capillary forces and the former two flow-controlling the pressure gradients (e.g. Pickup & Stephen 2000; Christie 2001; Ringrose & Bentley 2015). In the gravity capillary equilibrium method, the fluid saturations are not solely controlled by capillary forces but also by the density contrasts within the fluid phases (Ringrose & Bentley 2015). The validity of the different static upscaling techniques depends on the balance of capillary, gravity and viscous forces, and thus the relationship between flow rates and capillary forces in the system (e.g. Ringrose *et al.* 1993; Pickup & Stephen 2000; Pickup *et al.* 2005). Dynamic methods are not considered in this work.

During storage filling and cyclic operation, flow rates within the storage formation vary significantly, complicating the identification of the appropriate method for the given case. Given the small saturation changes in the heterogeneous ensemble during the storage operation and the clear vertical separation of the phases (cf. Pfeiffer *et al.* 2017), saturations can be expected to be close to the individual endpoints (i.e. either at maximum gas or water saturation). Hence, relative permeability and capillary pressure data were upscaled using the capillary equilibrium scheme but relative phase permeabilities were linearized between the respective endpoint values (Table 2). The applicability of this approach for the given example is tested by comparing linearized and non-linearized simulation runs of equal absolute permeability.

To obtain the effective relative permeability and capillary pressure for the homogeneous replacement models using such a capillary equilibrium upscaling, phase saturations and the corresponding relative permeabilities are computed for all capillary pressure values. These relative permeabilities are then used to calculate effective phase permeabilities, which are subsequently converted to an effective relative permeability by dividing by the average of the absolute permeability. A more detailed description of this method is given in, for example, Pickup & Stephen (2000). By using arithmetic or harmonic averages to calculate the average absolute permeability, an anisotropic effective relative permeability can be obtained (Ringrose & Bentley 2015). In this study, only isotropic relative permeabilities were considered, which were calculated using the arithmetic average of the formation permeabilities (Table 2; Fig. 2). The dataset as shown in Figure 2, albeit in a linearized form, was used for all homogenous simulation runs, which consequently only differed in terms of absolute permeability.

## Evaluation of homogeneous replacement models

To test the applicability of the individual homogenous replacement models for the heterogeneous ensemble discussed in Pfeiffer *et al.* (2017), as listed in Table 1, simulations were performed for each homogeneous replacement model and the results obtained were compared to those of the heterogeneous ensemble. The metrics used to assess the replacement models are the spatial distribution and the magnitude of induced pressure changes, as well as the storage flow rate during withdrawal periods, the H_{2} fraction in the withdrawn gas, the resulting power output and the gas volume in place.

Apart from the hydraulic properties of the formations, all simulation data, such as well positions, phase properties and boundary conditions, are identical in all cases. Pressure changes are reported relative to baseline simulations with no storage operation as small-scale pressure variations occur because of the different endpoints in the capillary pressure curves. In the heterogeneous ensemble, the 95th percentile of given metric is taken as the value which 95% of the realizations reach, while the 5th percentile is only reached by 5% of the realizations. Thus, the 95th percentile is below the median, indicating the lower bound of the ensemble, and the 5th percentile is above the median, representing the upper bound.

### Induced hydraulic effects

The initial gas injection of N_{2} for the first 710 days and the subsequent injection of H_{2} for 210 days results in a sharp and pronounced pressure increase at and near the storage wells, with the latter reaching the upper BHP limit of 65 bars and thus an overpressure of nearly 20 bars in all cases (Figs 3 and 4). Because the well injection rates are automatically reduced by ECLIPSE when the pressure limit is reached, the near-field overpressure signals show strong similarities during the initial gas injection up until 920 days. Except for models #4, #7, #8 and #9 (Table 1), all show overpressures of around 11 bars, which puts them within the interval spanning from the 95th percentile to the median of the heterogeneous ensemble near the storage site (<1 km away) and thus in a relatively good agreement with the heterogeneous ensemble. When permeability is estimated by the geometric average of the vertical harmonic means (model #4), by a harmonic mean (model #7) or by using the Cardwell & Parsons (1945) bounds (models #8 and #9), the pressure drop within the first couple of hundred metres is significantly larger due to the lower homogeneous hydraulic permeability estimated by these methods, resulting in overpressures of just 5–7 bars (not shown). Consequently, the overpressures away from the storage wells during the storage initialization are underestimated compared to the heterogeneous ensemble when using these models. Given the bad fit to the ensemble, these methods are not further discussed in this section. During the initialization phase, considering anisotropy in the cases with a higher homogeneous permeability does not result in any significant differences in the observed overpressures, as all wells operate at the limiting BHPs (Fig. 3).

With increasing distance from the storage site, the pressure signal from the initial gas injections is dampened and pressure levels in the far field of the storage site are lower than the ensemble median in most of the tested homogenous replacement cases. Comparing the isotropic simulation cases shows that only model #1 (arithmetic average) provides a good approximation of the median pressure increase of the ensemble with differences of <0.5 bars (Fig. 3b). Considering anisotropy in the form of the vertical permeability estimated by the harmonic average results in higher overpressures than in the respective isotropic cases, with the anisotropic models #1, #5 and #6 providing results in good agreement with the median overpressure for the far field of the storage (Fig. 3b). The remaining isotropic and anisotropic cases exhibit pressures below the 95th percentile of the heterogeneous ensemble during the initial injection, with model #5 showing a close fit to the 95th percentile itself. Independent of the distance to the storage site, model #1, in which the arithmetic mean is used to estimate the permeability of the homogeneous replacement model, provides the best approximation for the median of the heterogeneous ensemble during the initial gas injection phase.

During the storage operation phase, the pressure levels in the near and far field of the storage site record the alternating withdrawal and injection of the stored gas (Fig. 3). The variability between the tested homogenous cases is larger in this phase, especially during the short withdrawal periods. In the near field of the storage site, the pressure levels are better approximated using averaging model #1 (anisotropic or isotropic), which has a high homogeneous replacement permeability (Fig. 3a). However, in the isotropic case, the amplitudes of the pressure changes between injection and withdrawal are underestimated by up to 1 bar during injection and 2 bars during withdrawal periods. Considering anisotropy results in pressure differences in the ensemble median being below 0.5 bars during the injection phases, while the differences during withdrawals are similar to the isotropic set-up. The low-permeability cases, models #3, #4, #7, #8 and #9 (i.e. cases with permeabilities equal or below those estimated by a arithmetic mean of the vertical harmonic means: model #2), show smaller pressure drops during the withdrawal periods than in the heterogeneous ensemble.

In the far field of the storage site the higher permeability simulations show a stronger pressure decline as the overpressure signals dissipate into the boundary conditions more quickly (Fig. 3b). Cases with permeabilities equal or below to model #6 show more persistent overpressure signals in the far field, so that pressure levels are higher than the median of the heterogeneous ensemble towards the end of the simulations. Considering anisotropy results in increased pressure fluctuations compared to the isotropic cases, with the amplitude being proportional to the lateral permeability and thus the permeability contrast (Fig. 3b). However, the difference in overpressures predicted by the tested averaging cases at the end of the storage operation is less than 0.5 bars over a 5000 m distance (Fig. 3b). Compared to the absolute pressure levels at this observation point, which exceeds 150 bars, these differences are considered negligible.

The distribution of the overpressure signal within the storage formation is not perfectly matched with any of the tested homogenous replacement models. Models #1, #5 and #6 provide a reasonable fit to the average pressure distribution in the heterogeneous ensemble (Fig. 4). In cases such as models #8 or #9 (Cardwell & Parsons 1945 bounds) or when a harmonic mean is used (model #7), the lower averaged permeability causes an underestimation of the extent of the pressure perturbation. This is due to the larger pressure drop associated with the lower permeabilities.

The injected gas accumulates near the top of the structure given its lower density compared to the surrounding formation water (Fig. 5a). The extent of the gas phase is affected by the overall volume of gas in place (Fig. 6), which differs depending on the time at which the upper BHP limit is reached during the injection phases for the individual cases. The overall extent of the gas phase in the heterogeneous ensemble is best matched in model #1, although the gas-phase saturation distribution is obviously more homogenous and in general higher than the median of the heterogeneous ensemble (Fig. 5a: #1). In this simulation case, the gas in place, and thus implicitly also the extent of the gas phase, is slightly overestimated compared to the ensemble median when isotropic conditions are assumed, while the anisotropic simulation provides a near-perfect fit (Fig. 6).

Within the gas phase, the concentration of the H_{2} component is highest at the wells, declining in a roughly concentric pattern (Fig. 5b). The best match with the average H_{2} distribution in the heterogeneous ensemble is given for model #1 (arithmetic mean) and model #5 (geometric mean of vertical arithmetic means) because the volume of H_{2} injected during storage initialization and operation closely matches the ensemble median (cf. Fig. 6).

For the given storage set-up, it is thus found that the hydraulic effects due to storage initialization and operation in terms of the median pressure perturbation and extent of the gas phase can be readily approximated using a homogenous replacement models based on model #1b (i.e. the arithmetic anisotropic case). The range of pressure perturbations in the heterogeneous ensemble is not well represented by the different homogeneous cases; thus the homogeneous replacement models cannot be used to identify the variability that must be expected.

### Storage performance

During the storage phase, large amounts of gas are injected and extracted during each storage cycle. Because the upper BHP limits are reached in all cases considered, the injected volume differs between the individual cases because well flow rates are limited to lower values in the homogeneous replacement models with a lower effective permeability. In these cases, the pressure limits are reached more quickly, resulting in a lower overall gas in place as well as H_{2} in place (Fig. 6). The direct comparison of the N_{2} and H_{2} volumes in place during the storage operation shows that model #1 in the isotropic set-up provides a good estimate of the 5th percentile of the heterogeneous ensemble throughout the simulation, while the median is well approximated using the corresponding anisotropic set-up (Fig. 6). Furthermore, the volumes in place in model #5 are a good approximation of the 95th percentile of the heterogeneous ensemble if isotropic conditions are assumed. All other cases investigated here (see #3 or #6 in Fig. 6) yield N_{2} and H_{2} volumes in place below the 95th percentile of the heterogeneous ensemble, and thus do not provide a good approximation for the storage scenario investigated here.

The achieved storage flow rate during the withdrawal periods is an important measure, as it limits storage performance. Storage flow rates are shown for all homogeneous replacement models, as well as the ensemble and the ensemble median in Figure 7. The ensemble median indicates that target flow rates are achieved within the first 4–5 days in cycle 9. After that, the median rate decreases slightly until the 11th day to about 70% of the target rate, and more strongly until the end of the extraction period at day 14 to about 35%. The ensemble results show a more uniform decline in storage flow rate towards the end of the storage cycles than the homogenous replacement models. This effect is partly due to the median of the heterogeneous ensemble being an averaged and thus already smoothed result of the ensemble simulations. The median storage flow rate of the heterogeneous ensemble is best represented using model #1 assuming anisotropic conditions (Fig. 7a). However, during the shorter withdrawal cycles this model overestimates the ensemble median, with the differences becoming smaller towards the end of the storage cycles. For the longer storage periods, considering an anisotropic permeability results in an underestimation of the ensemble median, while the assumption of isotropy results in an overestimation. Model #5 also underestimates the storage flow rates compared to the median but provides a good approximation of the 95th percentile of the ensemble, independent of the length of the storage cycles or whether anisotropy is considered (Fig. 7a). All remaining cases (e.g. models #3 and #6) generally underestimate the storage flow rates and thus can only serve as a conservative estimate of the achievable flow rates. Water production during the withdrawal phases is similar in the homogenous cases to the heterogeneous ensemble, at about 0.05% for the heterogeneous ensemble and model #1 (isotropic) in the first storage cycle, with the water fraction increasing to 0.07% when anisotropy is considered. In later cycles, water production is below 0.05% in the heterogeneous ensemble and 0.01% for model #1 (isotropic).

The H_{2} fractions in the produced gas varied significantly during the first couple of storage cycles, while at the same time showing only low variability in the heterogeneous ensemble. In later cycles, the H_{2} fractions remain above 80% as the volume of H_{2} in place is increasing with each storage cycle (cf. Fig. 7b). In the homogenous cases, H_{2} fractions in the produced gas are in general slightly below the median of the heterogeneous ensemble, with model #1 providing the best estimate of the ensemble median (Fig. 7b). Considering anisotropy in the homogeneous simulation runs does not affect this metric substantially.

The actual power the storage site can deliver is a function of the achieved storage flow rate and the H_{2} fraction in the withdrawn gas. Because the temporal variability of the H_{2} fraction in produced gas is low and all cases reasonably estimate this fraction, the storage power output is determined by the storage flow rate. This is especially true for the later storage cycles (Fig. 7c). Consequently, model #1 in isotropic configuration again falls in between the median and 5th percentile of the heterogeneous ensemble, with a tendency towards the median for longer withdrawal periods. For this case, the minimal H_{2} flow rate achieved is around 3.78 × 10^{6} Sm^{3}/day in storage cycle 6, the last of the shorter 7 day cycles. The corresponding sustainable power output, which is the minimum power achieved and thus the guaranteed storage power, is 279 MW (Fig. 7c). Because the storage flow rate decreases more strongly during the longer withdrawal periods (cycles 7–9), the sustainable power output is reduced to about 130 MW in the isotropic model #1 (Fig. 7c). Considering anisotropy reduces the achievable flow rates, so that the sustainable power output is only 243 and 93 MW in cycles 6 and 9 for model #1.

The remaining homogeneous replacement models yield sustainable power outputs that are below the ensemble median: for example, the isotropic model #5 shows sustainable power outputs of around 181 and 80 MW in storage cycles 6 and 9, respectively, providing an approximation of the 95th percentile of the ensemble (Fig. 7c) during the later withdrawal periods.

The amount of H_{2} produced in the individual storage cycles represents the total energy withdrawn from the storage site. With increasing H_{2} fractions and storage flow rates, this metric obviously increases with the number of storage cycles. Again, model #1 provides a reasonable estimate of the ensemble median when anisotropy is considered, with a total of around 30.6 × 10^{6} Sm^{3} H_{2} produced during storage cycle 6 and around 54.2 × 10^{6} Sm^{3} in cycle 9, corresponding to an energy output of around 54 115 and 95 700 MWh, respectively. Neglecting anisotropy yields simulation results that fall in between the 5th percentile and the median of the ensemble. The isotropic and anisotropic runs of model #6 can provide an estimate of the 95th percentile of the heterogeneous ensemble, which falls between the values obtained in the respective simulations. Using model #5 to estimate the 95th percentile of the ensemble in terms of energy output does leads to an overestimation.

The effect of using a non-linear relative permeability is investigated by considering the full non-linear relative permeability data obtained through capillary equilibrium upscaling (Fig. 2). Results show that the pressure changes obtained are very close to the cases with linearized relative permeability. However, storage flow rate and thus power output differ significantly for a given absolute permeability model when anisotropy is considered in the reservoir permeability (Fig. 8). In this case, the results of the non-linear anisotropic case show flow rates below those of the 95th percentile of the heterogeneous ensemble. If an isotropic permeability is assumed, the differences between the non-linear and linear set-ups diminish. The same is true for the remaining homogeneous cases (i.e. models #2–#9), which show flow rates below those of 95th percentile of the ensemble for the linear and non-linear set-ups. Thus, for the given scenario using the non-linear relative permeability data obtained by the capillary equilibrium upscaling provides no benefit over using linearized data.

## Summary and conclusions

In this work, a number of homogeneous replacement models are constructed by applying different spatial averaging schemes to a heterogeneous ensemble. Results from the homogeneous replacement models are then compared to the heterogeneous ensemble using indicators such as well flow rate, storage power output or induced overpressure, in order to assessed their usability for allowing faster assessment of storage performance and induced hydraulic effects by reducing the simulation run time. It is found that estimating the effective permeability of the replacement model by using the arithmetic mean (such as in model #1, for example), as well as assuming anisotropy by estimating the vertical permeability using the harmonic mean, yields pressure changes that are in good agreement with the median of the heterogeneous ensemble. A tendency to underestimate the pressure changes during the storage operation is also identified. Assuming isotropy provides a reasonable match to the median pressure changes but their amplitudes are less well matched than in the anisotropic set-up. Similarly, the spatial distribution of the induced pressure perturbations as well as the maximum extent of the gas phase and thus the extent of the directly induced hydraulic effects are best approximated using the same homogeneous replacement model. While local differences may be large, a reliable approximation of the spatial extent of the induced hydraulic effects is found in the investigated scenario, when hydraulic permeability is averaged using the arithmetic mean in the lateral direction and the harmonic mean in the vertical direction. One or a combination of the tested homogeneous replacement models cannot capture the variability in the heterogeneous ensemble. If the permeability of the homogeneous replacement model is estimated using an arithmetic mean of the vertical harmonic means (model #2), a geometric mean (model #3), the geometric mean of vertical harmonic means (model #4) and a pure harmonic mean (model #7), the overpressures are overestimated near the storage site and underestimated in the far field. Using either the Cardwell & Parsons (1945) bounds (models #8 and #9) or the method of Gutjahr *et al.* (1978) (model #6) equally results in misestimating the pressure perturbations caused by the storage operation.

In terms of storage performance metrics (i.e. gas in place, achievable storage flow rate and power output), it is found that the homogeneous replacement models in which the lateral permeability is estimated by an arithmetic mean and the vertical permeability by a harmonic mean (model #1 anisotropic) can be used to estimate the ensemble median. Neglecting anisotropy provides results that are a better fit to the 5th percentile of the ensemble. The 95th percentile can be reasonably approximated using a simulation case in which permeability is estimated by a geometric mean of the arithmetic means along each vertical grid cell column. The remaining averaging schemes tested here provide results that fall below the 95th percentile of the heterogeneous ensemble regarding storage performance metrics. It is found that the lower bound and, to some extent, also the variability of the heterogeneous ensemble in terms of gas in place, minimal storage flow rate and power output can be reasonably well approximated using just two cases (i.e. models #1 and #5). This is generally independent of the length of the storage cycles; however, the best fit is achieved towards the end of the cycles after about 7–9 days. For shorter storage cycles of less than 5 days, the lower bounds of the ensemble results cannot be estimated using the homogenous replacement models tested in this study.

The run times of the individual heterogeneous storage simulations vary depending on the complexity of the simulation (e.g. steep gradients, well pressures being close to limits tend to increase the simulation runtime). The same is true for the homogenous cases, so that individual runtimes might be similar in a direct comparison, especially when using upscaled non-linear relative permeability functions. However, the total runtime required to complete the heterogeneous ensemble simulations obviously exceeds the time required to run even a couple of homogeneous simulation cases to obtain the likely bounds of the median ensemble behaviour. Thus, homogeneous replacement models based on averaging permeability using an arithmetic mean and a geometric mean of arithmetic averages along each grid column, in combination with linearized relative permeabilities obtained by capillary-equilibrium-based upscaling, provide an estimation of the median and the lower bound of the given heterogeneous ensemble results in terms of induced hydraulic effects and storage performance.

Flow-based upscaling methods based on small-scale simulations with local boundary conditions (e.g. Guérillot *et al.* 1990) or large-scale models with global boundaries (e.g. Holden & Nielsen 2000) were not tested in this analysis but can potentially be used to improve the homogenous replacement models. Because this work is based on one geological setting so far, a direct transfer and successful application to other locations with other types of heterogeneity is not guaranteed. Thus, additional application studies would help to further corroborate the main finding reported here. However, the tested homogeneous replacement models based on simple analytical averaging schemes and linearized relative phase permeabilities are helpful for a first and computationally less expensive assessment of the storage behaviour and induced effects.

## Acknowledgements

We would like to acknowledge the very throughout reviews of two anonymous reviewers, which have helped us considerably to improve this manuscript.

## Funding

We gratefully acknowledge funding of this project provided by the German Federal Ministry for Economic Affairs and Energy (BMWi) under Grant No. 03ET6122A through the energy storage funding initiative ‘Energiespeicher’ of the German Federal Government.

- © 2019 The Author(s). Published by The Geological Society of London for GSL and EAGE. All rights reserved